ERRATA

p.  15,  formula  33,  append  dots. 

p.  19,  line  17,  delete  comma  between  symbols. 

p. 21, line 8 (second exponential), to exponent add terms

+  '"i-i  ■'"■'"  "l  "2  +  ''l3  *'^  «!  «:i  • 

p. 26. For the footnote substitute: Tables of the Incomplete Gamma Function,
computed by the Staff of the Department of Applied Statistics, University College,
London, have this year been published by H. M. Stationery Office for the
Department of Scientific and Industrial Research.

p.  28,  line  18,  append  comma. 

p.  39,  formula  79,  replace  first  dx  by  dr. 

p. 41, formula 82, replace $2\pi$ by $\infty$.

p. 44, line 5, for $d\theta$ read $d\theta/2\pi$.

pp.  42 — 44,  §  13.  The  reasoning  is  faulty  for  the  second,  general,  case  since 
e,  and  t)^  are  then  functions  of  r,  say  e^  (r)  and  ??,  (t).  For  the  proof  of  this  case  it 
is  simpler  to  write  (7/coss^  +  f/s"sins^  for  ggCOs{sd  +  y^  and,  making  like  sub- 
stitutions throughout,  the  analysis  leads  to  the  same  final  result  (92)  in  which,  now, 
7?s  must  assume  the  meaning  -q^  {ix). 

p. 46, note †, lines 6 and 9, for ix ('") read x ('")/'•

ERRATA

p. 16. In the integral at end of note replace limits 0 to x by limits x to $\infty$.
p.  19.   Replace  right  hand  of  last  equation  by 

p. 46, note †. Throughout the note for coshxf read 2 cosh a:^, for sinha;^ read
2sinh.r^,  for  ^(X^  +  X-^)   read   A'»^  +  Z-«,   for   i(Z^-A'-^)   read  .Y^-A'-^,   for 

-  read  - . 


I.   INTRODUCTORY

1.  Since  in  other  branches  of  science  symbols  bearing  an  objective 
or  logical  significance  have  been  usefully  employed,  conjoined  with 
symbols  of  number,  to  express  quantity,  it  may  be  expected  that  in  the 
science  of  statistics,  which  of  its  essence  is  the  enumeration  of  logical 
classes,  such  symbols  will  find  serviceable  application. 

In  what  follows  symbols  having  a  logical  but  no  numerical  inter- 
pretation are  frankly  admitted  into  the  mathematical  expressions  of  the 
counts  or  distributions  of  frequency  and  the  consequences  of  this  course 
are  followed  up. 

It  will  appear  that,  by  ascribing  to  such  symbols  the  laws  of  common 
algebra  in  their  combination,  the  description,  analysis  and  derivation  of 
frequency  distributions  are  often  much  simplified. 

2. If a character, which may be spoken of as the character A, is
divided into s categories, or grades, designated by the symbols

$$A_1,\ A_2,\ \dots,\ A_s,$$

and, in a population of $l$ individuals, the number $l_1$ are placed in the
grade $A_1$, the number $l_2$ in the grade $A_2$ and so on, the enumeration of
the population as regards this character may be shown in the single
expression

$$l_1 A_1 + l_2 A_2 + \dots + l_s A_s, \tag{1}$$

where $l_1 + l_2 + \dots + l_s = l$.

Dividing by $l$ and denoting by $p_1, p_2, \dots$ the fractions or frequencies
$l_1/l,\ l_2/l,\ \dots$, we obtain

$$p_1 A_1 + p_2 A_2 + \dots + p_s A_s, \tag{2}$$

the 'frequency array' of the grades $A_1, A_2, \dots$ in the population.

Now if from this population random samples of n are taken (by
n repetitions of a draw and a return to the stock) the samplings may
eventuate in $l^n$ ways and, by the distributive law of algebra, these are
arrayed in the $l^n$ terms of the product

$$(l_1 A_1 + l_2 A_2 + \dots + l_s A_s)^n. \tag{3}$$

Since each way of sampling is equally likely to occur, dividing by
$l^n$, the number of ways, we obtain

$$(p_1 A_1 + p_2 A_2 + \dots + p_s A_s)^n, \tag{4}$$

the 'frequency array' of samples of n.

If like terms are collected, the expansion, by the multinomial
theorem, is

$$\sum p_{n_1 n_2 \dots n_s}\, A_1^{n_1} A_2^{n_2} \dots A_s^{n_s}, \tag{5}$$

where

$$p_{n_1 n_2 \dots n_s} = \frac{n!}{n_1!\, n_2! \dots n_s!}\, p_1^{n_1} p_2^{n_2} \dots p_s^{n_s}, \tag{6}$$

$n_1, n_2, \dots$ having all values from 0 to n subject to
$n_1 + n_2 + \dots + n_s = n$.

Here the coefficient $p_{n_1 n_2 \dots n_s}$ is the frequency, or probability of
occurrence, of the draw characterised as containing $n_1$ individuals of
the kind $A_1$, $n_2$ individuals of the kind $A_2$, and so on, and
symbolised by

$$A_1^{n_1} A_2^{n_2} \dots A_s^{n_s}. \tag{7}$$

It will be seen that the constitution of the sample, as shown in (7),
differs in its mode of expression from the constitution of the population,
as shown in (1), in that the individual constituents are associated in
product, instead of in sum, the number of each kind of constituent
appearing as an index instead of as a coefficient.
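
The passage from the product (4) to the collected form (5) and (6) can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `sample_array` and `multinomial_frequency` and the three-grade population are assumptions chosen for illustration. It multiplies out (4) one ordered draw at a time, collects like terms, and compares each coefficient with (6).

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import factorial


def sample_array(p, n):
    """Expand (p_1 A_1 + ... + p_s A_s)^n and collect like terms.

    Returns a dict mapping the index tuple (n_1, ..., n_s) to its frequency.
    """
    s = len(p)
    array = Counter()
    for draw in product(range(s), repeat=n):  # one ordered draw per term of the product
        weight = Fraction(1)
        counts = [0] * s
        for grade in draw:
            weight *= p[grade]
            counts[grade] += 1
        array[tuple(counts)] += weight
    return dict(array)


def multinomial_frequency(p, counts):
    """Right-hand side of (6) for a given index tuple."""
    coeff = factorial(sum(counts))
    for c in counts:
        coeff //= factorial(c)
    freq = Fraction(coeff)
    for p_i, c in zip(p, counts):
        freq *= p_i ** c
    return freq


# Hypothetical population with three grades and samples of 4.
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
array = sample_array(p, 4)
assert all(freq == multinomial_frequency(p, counts) for counts, freq in array.items())
assert sum(array.values()) == 1  # suppressing every symbol leaves the total frequency
```

Exact fractions are used so that the comparison with (6) is an identity rather than a floating-point approximation.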

Now  it  is  clear  that  this  way  of  expressing  constitution,  that  is  to 
say  by  symbols  associated  in  product,  which  is  presented  as  the  natural 
consequence  of  invoking  the  distributive  law  of  algebra  for  exhibiting 
and  enumerating  the  possible  samples  of  n  drawn  from  a  population, 
may,  by  convention,  be  extended  to  indicate  the  constitution  of  a  unit 
in  the  several  characters  it  possesses,  whether  these  characters  are  re- 
garded as  the  result  of  a  sampling  process  or  as  given  in  any  other 
manner. 

Any population, then, formed of an aggregate of units of diverse
constitutions, can be spoken of as shown in 'sample' array when
expressed in a form such as (5), that is to say, in the form

$$\sum p_{abc\dots}\, A^a B^b C^c \dots, \tag{8}$$

to indicate that $p_{abc\dots}$ is the frequency of occurrence of an individual
possessing the character A in the degree a, the character B in the
degree b, and so on.

In the extended definition given to the symbols in (8) it is not
necessary that the indices have the integral values proper to (5).
According to the natures of the several characters A, B, C ... and their
mode of measurement, the indices may take fractional and negative
values and, if any of the characters are continuous variates, signs of
integration will replace signs of summation. Thus if all are continuous
variates the frequency array (8) will assume the form

$$\iiint \dots f(a, b, c \dots)\, A^a B^b C^c \dots\, da\, db\, dc \dots, \tag{9}$$

with the meaning that $f(a, b, c \dots)\, da\, db\, dc \dots$ individuals are
enumerated as possessing the combination of measurements a in the
character A, b in the character B, and so on.

3. It may be observed at this point that the introduction of the
symbols into the expressions describing statistical counts supplies a
feature which enables the whole distribution to be exhibited in a single
expression without ambiguity or reserve. Thus the binomial and
Gaussian frequencies, expressed as

$$U = (p' + pA)^n, \tag{10}$$

$$V = \int_{-\infty}^{\infty} \frac{e^{-x^2/2\sigma^2}}{\sqrt{2\pi}\,\sigma}\, A^x\, dx, \tag{11}$$

are complete embodiments and may enter as they stand into
mathematical analysis. Without the symbols

$$(p' + p)^n \quad\text{and}\quad \int_{-\infty}^{\infty} \frac{e^{-x^2/2\sigma^2}}{\sqrt{2\pi}\,\sigma}\, dx$$

stand only for unity until the reservation is made that the conjunctive
signs between the expanded terms or differentials are to be ignored.

4. As a simple example of the advantage that may accrue from
assembling all the eventualities in a single expression, let us apply
arrays to the well-known problem of 'points.' A and B play a game
for a point, in which A's chance of success is p and of failure p' = 1 - p.
The game is repeated as often as A wins and the contest closes when
A loses. What are A's chances of winning 0, 1, 2 ... points?

The symbol A indicating a win to the player A, let f(A) be the
required frequency array of total points won by A in a contest,
supposing the contests continued without limit. The chances of the first
game are shown as

$$p' + pA,$$

indicating that p' is the chance of an initial loss and p the chance of
an initial win.

In the latter event the chances of subsequent wins are arrayed as f(A).

$$\therefore\ f(A) = p' + pA\, f(A),$$

or

$$f(A) = \frac{p'}{1 - pA}. \tag{12}$$

If the conditions are varied, the contest closing after n losses to A,
we are sampling the array (12) n times in succession, n such samples
bringing A's losses up to n. Thus A's expectations of winning 0, 1, 2 ...
points in a contest under these conditions are arrayed

$$\left(\frac{1}{p'} - \frac{p}{p'}\, A\right)^{-n}, \tag{13}$$

the negative binomial series of frequencies.
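
As a numerical illustration of (12) and (13), the sketch below (an illustration only; `series_mul` and `points_array` are hypothetical helper names, and the values of p and n are arbitrary) expands $p'/(1 - pA)$ as the series $p'(1 + pA + p^2A^2 + \dots)$, multiplies n copies together, and compares the coefficients with the usual closed form of the negative binomial frequencies, $\binom{n+k-1}{k} p'^n p^k$.

```python
from fractions import Fraction
from math import comb


def series_mul(a, b, terms):
    """Product of two power series in A, truncated to `terms` coefficients."""
    out = [Fraction(0)] * terms
    for i, a_i in enumerate(a[:terms]):
        for j, b_j in enumerate(b[:terms - i]):
            out[i + j] += a_i * b_j
    return out


def points_array(p, n_losses, terms):
    """Coefficients of A^0, A^1, ... in {p'/(1 - pA)}^n, i.e. the array (13)."""
    q = 1 - p                                    # p' in the text
    single = [q * p ** k for k in range(terms)]  # the array (12), expanded
    result = [Fraction(1)] + [Fraction(0)] * (terms - 1)
    for _ in range(n_losses):                    # sample (12) n times in succession
        result = series_mul(result, single, terms)
    return result


p = Fraction(2, 5)
freqs = points_array(p, 3, 8)
closed_form = [comb(3 + k - 1, k) * (1 - p) ** 3 * p ** k for k in range(8)]
assert freqs == closed_form
```

Truncating each product to a fixed number of terms loses nothing in the retained coefficients, since a coefficient of $A^k$ depends only on lower-order terms of the factors.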


II.   ELEMENTARY  PROPOSITIONS.   MOMENT  ARRAYS 

1. Attention will now be directed to a few elementary propositions
applicable to frequency arrays as above defined, which may be described
as complete statistical enumerations, arrayed as a series of terms, each
term comprising a numerator, or count, multiplying a denominator or
arrangement of symbols in product formation to indicate the characters
borne and their amounts.

2. Clearly, if

$$f(A) = \sum p_x A^x, \quad\text{or}\quad f(A) = \int p(x)\, A^x\, dx,$$

are arrays in A, multiplication by $A^m$, or $A^{-m}$, respectively augments,
or reduces, all the indices in the array by m and so throws back, or
advances, the origin of measurement of A m units. For instance, if m is
the mean of the array f(A),

$$A^{-m} f(A)$$

will represent the array referred to its mean as origin of measurement. (14)

Again the substitutions $A = A_1^s$, or $A^s = A_1$, respectively diminish,
or increase, the unit of measurement s times. For instance, if $\sigma$ is the
standard deviation of the array f(A), by putting $A^{\sigma} = A_1$, we get

$$f(A_1^{1/\sigma})$$

as the representation of the array with unit of measurement so changed
as to make the S.D. equal to unity. (15)

If U = f(A, B, C ...) array the frequencies with which varying
quantities of A, B, C ... are found in the units of a population, $U^n$ will
array the frequencies with which total A, total B ... will be found in
samples of n drawn at random from U (with replacement if the
population is not a large multiple of n). (16)

In like manner $U^l V^m W^n \dots$ will array the totals of the characters
A, B, C ... found in samples of l + m + n + ... drawn, l from a population
U, m from a population V etc. possessing these characters. (17)

If samples of varying number are drawn, with replacement, from
U, the number r being drawn with frequency $p_r$, then 'number in
sample' is a character which, if its symbol be S, has the array

$$\phi(S) = p_1 S + p_2 S^2 + p_3 S^3 + \dots. \tag{18}$$

By (16), if U is put for S in the right-hand side of (18), we shall
have, arrayed, the frequencies of total A, total B etc. in the variously
sized samples.

Hence if U is sampled in variable numbers, the number in sample
having the frequency array $\phi(S)$, the totals of the characters to be
found in the samples will have the frequency array

$$\phi(U). \tag{19}$$
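
Propositions (16) and (19) amount to raising a polynomial to a power and composing one polynomial with another. A minimal sketch follows, under assumptions that are not in the text: the unit array is taken as $U = \tfrac12 + \tfrac12 A$ (each unit carries 0 or 1 of A), the sample-number frequencies $p_1, p_2, p_3$ are invented for the example, and the helper names `poly_mul` and `compose` are illustrative.

```python
from fractions import Fraction
from math import comb


def poly_mul(a, b):
    """Product of two arrays in the single symbol A (coefficient lists)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out


def compose(phi, U):
    """Return phi(U) = sum over r of phi[r] * U^r, as in (19).

    `phi` maps each sample number r to its frequency p_r.
    """
    max_r = max(phi)
    out = [Fraction(0)] * ((len(U) - 1) * max_r + 1)
    power = [Fraction(1)]                 # U^0
    for r in range(max_r + 1):
        for k, c in enumerate(power):     # power holds U^r, the array (16) for samples of r
            out[k] += phi.get(r, 0) * c
        power = poly_mul(power, U)
    return out


U = [Fraction(1, 2), Fraction(1, 2)]                           # assumed unit array
phi = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}  # assumed p_1, p_2, p_3
totals = compose(phi, U)

# Direct check: frequency of total k is sum over r of p_r * C(r, k) / 2^r.
direct = [sum(p_r * comb(r, k) * Fraction(1, 2) ** r for r, p_r in phi.items())
          for k in range(4)]
assert totals == direct
assert sum(totals) == 1
```

The same `compose` routine would serve for any finite $\phi$; an infinite series such as the Poisson array has to be truncated or handled in closed form.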

In any frequency array the suppression of symbols gives the
frequencies of the larger classes. For instance, if f(A, B, C) arrays the
frequencies of A, B, C, putting B = 1, C = 1 gives f(A, 1, 1), the total
frequencies of A. If all the symbols are suppressed in a frequency array the
result is, of course, the total frequency, generally taken as unity. (20)

3.  Moment  arrays.  Of  special  utility  are  the  symbols  we  have  con- 
sidered in  the  development  of  the  moment  coefficients  of  frequency 
distributions*. 

*  Smooth  mathematical  frequencies  are  often  fitted  to  the  rough  frequencies  of 
samples  by  equating  the  moments.  Perhaps  the  justification  rests  in  the  following 
proposition. 

If the frequencies to be fitted have coefficient of the form

$$f = e^{c_0 + \sum c_{r,s\dots}\, x^r y^s \dots} \tag{i}$$

(including the general Gaussian), such fitting by moments will render the probability
of the sample a maximum for the given classification, $dx\,dy \dots$ †, that is, will produce
the 'best' fit.

For if $\Pi$ repeats, in product, values for the n units of the sample, and $m_{r,s\dots}$ stands
for the r, s ... moment coefficient of the sample, the probability of the sample is, by (6),

$$n!\ \Pi\, e^{c_0 + \sum c_{r,s\dots}\, x^r y^s \dots}\, (dx\,dy \dots)^n = n!\ e^{n\,(c_0 + \sum c_{r,s\dots}\, m_{r,s\dots})}\, (dx\,dy \dots)^n,$$

and is a maximum when $c_0 + \sum c_{r,s\dots}\, m_{r,s\dots}$ is a maximum. Hence variations $\delta c_0 \dots$ in
the constants must conform with

$$\delta c_0 + \sum m_{r,s\dots}\, \delta c_{r,s\dots} = 0.$$

But since $\iint f\, dx\,dy \dots = 1$ the variations must conform with

$$\delta c_0 + \sum m'_{r,s\dots}\, \delta c_{r,s\dots} = 0,$$

where $m'_{r,s\dots}$ is the r, s ... moment of the assumed universal (i).

Hence $m'_{r,s\dots} = m_{r,s\dots}$ or, for best fit of (i) to a series of observations, the moments,
corresponding to the terms of the polynomial exponent, must be equal in universal
and sample.

† There are no 'probabilities' until the classification is specified.

In the first place, considering a single variate, if

$$f(A) = \sum_x p_x A^x \tag{21}$$

arrays the frequencies, $p_x$, with which the measures, x, of the character,
A, occur in a population, the substitution in (21) of

$$A = e^{\alpha} \tag{22}$$

gives

$$f(e^{\alpha}) = \sum p_x e^{x\alpha} = \sum p_x \left(1 + x\alpha + x^2 \frac{\alpha^2}{2!} + \dots\right)
= 1 + m_1 \alpha + m_2 \frac{\alpha^2}{2!} + \dots + m_r \frac{\alpha^r}{r!} + \dots, \tag{23}$$

which may be termed the 'moment array' of the character A, the rth
moment, $m_r$, of the distribution, about the origin, being the coefficient
of $\alpha^r / r!$ in the array.

The same is true if A is a continuous variate arrayed as

$$f(A) = \int p(x)\, A^x\, dx.$$

In the second place, if there are several characters A, B, C ..., the
substitutions $A = e^{\alpha}$, $B = e^{\beta}$ ..., render

$$A^x B^y C^z \dots = e^{x\alpha} e^{y\beta} e^{z\gamma} \dots$$

and render the frequency array an array of moments whose symbols
betoken the kind of moment and whose coefficients record the value of
the moment.

The 'moment array,' then, is the result of substituting

$$A = e^{\alpha},\quad B = e^{\beta},\quad C = e^{\gamma}\ \dots$$

in the 'frequency array' and has this property that, when expanded in
integral powers of the new symbols $\alpha, \beta, \gamma \dots$, it gives the r, s, t ...
moment of the characters A, B, C ..., about the origin, as the coefficient of

$$\frac{\alpha^r \beta^s \gamma^t \dots}{r!\, s!\, t! \dots} \tag{24}$$

in the expansion.
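
The reading of moments from the moment array (23) can be illustrated for a small discrete array. The sketch below is illustrative only: the function name `moment_array` and the example frequencies (those of $(\tfrac12 + \tfrac12 A)^2$) are assumptions. It builds the coefficients of $\alpha^0, \alpha^1, \dots$ in $f(e^{\alpha})$ by expanding each $e^{x\alpha}$ as a truncated series, then multiplies the coefficient of $\alpha^r$ by $r!$ to recover the rth moment about the origin.

```python
from fractions import Fraction
from math import factorial


def moment_array(freqs, order):
    """Coefficients of alpha^0 .. alpha^order in f(e^alpha) = sum_x p_x e^{x alpha}."""
    coeffs = [Fraction(0)] * (order + 1)
    for x, p_x in freqs.items():
        for r in range(order + 1):
            # term of the series for e^{x alpha}: (x alpha)^r / r!
            coeffs[r] += p_x * Fraction(x) ** r / factorial(r)
    return coeffs


# Assumed example: the array (1/2 + A/2)^2 = 1/4 + A/2 + A^2/4.
freqs = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
coeffs = moment_array(freqs, 4)

# By (23) the r-th moment about the origin is r! times the coefficient of alpha^r.
moments = [factorial(r) * c for r, c in enumerate(coeffs)]
assert moments == [1, 1, Fraction(3, 2), Fraction(5, 2), Fraction(9, 2)]
```

For several characters the same idea applies with one series per symbol, the r, s ... moment being $r!\,s! \dots$ times the coefficient of $\alpha^r \beta^s \dots$, as in (24).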

It will be convenient, in the sequel, to express arrays indifferently
in terms of the capital letters, signifying character units, or in terms
of the corresponding Greek letters, having an interpretation in reference
to moments; and it will be understood that the symbol A may be
replaced by $e^{\alpha}$ and the symbol $\alpha$ by $\log A$ in the course of any process,
if it is desirable to do so, the expression still being called the 'array.'

In the end an array may be expressed in character symbols in the
form f(A) which when expanded in integral or fractional or differential
powers of A arrays the frequencies; or, it may be expressed in moment
symbols, in the form $\phi(\alpha)$, which, when expanded in positive integral
powers of $\alpha$, arrays the positive integral moments of the distribution.

4. When arrays are expressed in the moment symbols, propositions
(14), (15) and (20) may be replaced by (25) and (26):

Change of origin to m units in advance is effected by multiplication
of the array by $e^{-m\alpha}$, whilst the substitution, $s\alpha = s'\alpha'$, effects a change
of unit, s units of the character A having as equivalent s' units of the
character A'. (25)

Suppression of symbols to give the moment arrays of the larger
classes is effected by writing them zero. For example, if $\phi(\alpha, \beta, \gamma)$ is a
moment array, expressing, by expansion, the moments in the characters
A, B, C, suppression of $\beta$, $\gamma$, by writing $\beta = 0$, $\gamma = 0$, gives $\phi(\alpha, 0, 0)$,
the moment array of the character A. (26)

5. Linear transformations of the variates are effected by linear
transformations of the moment symbols. For, using dashed letters for
the new measures and symbols,

$$A'^{x'} B'^{y'} C'^{z'} \dots = A^x B^y C^z \dots,$$

$$\therefore\ e^{x'\alpha' + y'\beta' + z'\gamma' + \dots} = e^{x\alpha + y\beta + z\gamma + \dots},$$

$$\therefore\ x'\alpha' + y'\beta' + z'\gamma' + \dots = x\alpha + y\beta + z\gamma + \dots$$

for all values of x, y, z ....

Hence if the transformation is to be

$$x' = c_{11} x + c_{12} y + \dots,$$
$$y' = c_{21} x + c_{22} y + \dots,$$
$$\dots$$

this is effected by the substitution

$$\alpha = c_{11} \alpha' + c_{21} \beta' + \dots,$$
$$\beta = c_{12} \alpha' + c_{22} \beta' + \dots,$$
$$\dots \tag{27}$$

Having now explained the meaning of the symbols and the purpose
of their introduction among the counts or frequencies to give full
expression to the statistical distributions and having indicated some of
the more elementary properties that flow from their definition, we will
next illustrate their use by applying them to propositions for the most
part of a familiar nature.

III.   BINOMIAL, POISSON, GAUSSIAN, EXPONENTIAL
AND GAMMA TYPE FREQUENCIES

1. If the character A is present in the proportion p of a population
and absent in the proportion p' = 1 - p, the population is represented
by the array

$$p' + pA,$$

and the number of A's found in samples of n drawn at random (with
replacement) will have the frequency array

$$(p' + pA)^n, \tag{28}$$

and hence by (23) the moment array

$$(p' + p e^{\alpha})^n = \left(1 + p\alpha + p \frac{\alpha^2}{2!} + \dots\right)^n = 1 + np\alpha + \dots.$$

The mean is therefore np and, referred to the mean, the array is,
by (25),

$$e^{-np\alpha} (p' + p e^{\alpha})^n = (p' e^{-p\alpha} + p e^{p'\alpha})^n
= \left\{1 + pp' \frac{\alpha^2}{2!} + pp'(p' - p) \frac{\alpha^3}{3!} + pp'(p'^2 - pp' + p^2) \frac{\alpha^4}{4!} + \dots\right\}^n. \tag{29}$$

The moments of the binomial (28) about the mean are, therefore,
the coefficients of $\alpha^2/2!$, $\alpha^3/3!$ ... when (29) is expanded by the
multinomial theorem. For instance, the second, third and fourth moments are:

$$npp', \quad npp'(p' - p) \quad\text{and}\quad npp'\{p'^2 + (3n - 4)\,p'p + p^2\}. \tag{30}$$

Since  moments  of  low  orders  only  are,  as  a  rule,  required,  the  ex- 
pansion of  such  expressions  as  (29)  presents  no  difficulties,  and  in  the 
sequel  it  will  often  be  deemed  sufficient  to  leave  the  moment  array  in 
the  unexpanded  form,  as  a  useful  formula,  from  which  moments  of  any 
given  order  may  readily  be  derived. 
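
The three moments quoted in (30) can be confirmed by direct summation over the binomial frequencies. This is a hedged check rather than the method of the text (which expands (29)); the function names and the values n = 7, p = 1/3 are assumptions made for the example.

```python
from fractions import Fraction
from math import comb


def binomial_central_moment(n, p, r):
    """r-th moment about the mean np of the array (p' + pA)^n, by direct summation."""
    q = 1 - p                      # p' in the text
    mean = n * p
    return sum(comb(n, k) * p ** k * q ** (n - k) * (k - mean) ** r
               for k in range(n + 1))


def formula_30(n, p):
    """Second, third and fourth moments about the mean as given in (30)."""
    q = 1 - p
    return (n * p * q,
            n * p * q * (q - p),
            n * p * q * (q * q + (3 * n - 4) * p * q + p * p))


n, p = 7, Fraction(1, 3)           # arbitrary illustrative values
direct = tuple(binomial_central_moment(n, p, r) for r in (2, 3, 4))
assert direct == formula_30(n, p)
```

The fourth moment in (30) is the familiar $npp'\{1 + 3(n-2)pp'\}$ written with $1 = (p + p')^2$ expanded.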

2. The Poisson limit to the binomial frequencies is reached by
supposing the character A to be very scarce in the population and the
samples taken to be very large in number. If the mean np remains
finite and its value is denoted by m, the frequency array of the number
of A's in samples of n becomes, in the limit, when $p = 0$, $n = \infty$,

$$(p' + pA)^n = \{1 + p(A - 1)\}^n = e^{m(A - 1)}. \tag{31}$$

This, expanded in powers of A, gives the well-known Poisson
frequencies,

$$e^{-m} \left(1 + mA + \frac{m^2}{2!} A^2 + \dots\right),$$

and, putting $e^{\alpha}$ for A, gives the moment array $e^{m(e^{\alpha} - 1)}$ about the
origin.

The coefficient of $\alpha$ is m; the mean is therefore m and, multiplying
by $e^{-m\alpha}$, we get the array, referred to the mean, in the form

$$e^{m(e^{\alpha} - 1 - \alpha)} = e^{m(\alpha^2/2! + \alpha^3/3! + \dots)}, \tag{32}$$

whose expansion as $e^{m\alpha^2/2!} \times e^{m\alpha^3/3!} \times \dots$ is easily effected by ordinary
multiplication, to any required number of terms, each term giving a
moment in accordance with (23), the 2nd, 3rd ... moments being

$$m, \quad m, \quad m + 3m^2, \quad m + 10m^2, \quad m + 25m^2 + 15m^3 \text{*}. \tag{32'}$$
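
The expansion of (32) can be carried out mechanically. The sketch below is an illustration, not the procedure of the text: instead of multiplying the separate exponentials it uses the standard recurrence for the exponential of a power series ($g = e^{h}$ implies $k\,g_k = \sum_j j\,h_j\,g_{k-j}$), and the names `series_exp` and `poisson_central_moments` are assumptions. It then compares the resulting moments with (32') for a sample value of m.

```python
from fractions import Fraction
from math import factorial


def series_exp(h, terms):
    """Power-series coefficients of exp(h(alpha)), assuming h[0] == 0."""
    g = [Fraction(1)] + [Fraction(0)] * (terms - 1)
    for k in range(1, terms):
        # from g' = h' g:  k g_k = sum_{j=1..k} j h_j g_{k-j}
        g[k] = sum(j * h[j] * g[k - j] for j in range(1, k + 1)) / k
    return g


def poisson_central_moments(m, order):
    """Moments about the mean read from the array (32), up to the given order."""
    # exponent of (32): m (alpha^2/2! + alpha^3/3! + ...)
    h = [Fraction(0), Fraction(0)] + [Fraction(m) / factorial(j)
                                      for j in range(2, order + 1)]
    g = series_exp(h, order + 1)
    return [factorial(r) * g[r] for r in range(order + 1)]


m = Fraction(3, 2)                 # arbitrary illustrative mean
mu = poisson_central_moments(m, 6)
assert mu[2:] == [m, m, m + 3 * m**2, m + 10 * m**2, m + 25 * m**2 + 15 * m**3]
```

Replacing the exponent list `h` by that of any other array written as an exponential gives its moments in the same way.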

3. Poisson frequencies are of interest in connection with samples of
variable number.

If the population has frequencies p, q ... in the separate
compartments of any classification, it is arrayed

$$U = pA + qB + \dots,$$

and, if samples of fixed number n are drawn, with replacement, they are
arrayed

$$U^n = (pA + qB + \dots)^n.$$

The numbers in any two compartments A, B are therefore
correlated; in fact, by suppressing the remaining symbols (writing them
unity), the correlation is arrayed

$$\{1 + p(A - 1) + q(B - 1)\}^n.$$

If, however, 'number in sample' varies, with Poisson frequencies,
about a mean m and has, therefore, the array

$$e^{m(S - 1)},$$

then, by (19), by putting U for S, the samples have array

$$e^{m(U - 1)},$$

or

$$e^{m(pA + qB + \dots - 1)},$$

or

$$e^{mp(A - 1)}\, e^{mq(B - 1)} \dots. \tag{33}$$

Thus we obtain the interesting result that if samples are drawn, with
replacement, from a population, enumerated under any simple or
complex headings, with Poisson frequencies about a mean m, the numbers
in each separate cell of the classification vary, in complete independence
of one another, with Poisson frequencies about their individual mean
values†.
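
The factorisation in (33) can be checked numerically for a particular case. In the sketch below the values of m, p and q, the summation cutoff, and the function names are all assumptions made for the illustration. The joint frequency of a units in cell A and b units in cell B is computed by summing over the Poisson sample number and the multinomial allocation, with all remaining cells lumped together, and is compared with the product of two independent Poisson frequencies of means mp and mq.

```python
from math import exp, factorial, isclose


def poisson(k, mean):
    """Poisson frequency of the count k about the given mean."""
    return exp(-mean) * mean ** k / factorial(k)


def joint_cell_frequency(a, b, m, p, q, cutoff=60):
    """Frequency of (a in cell A, b in cell B) when the sample number is Poisson(m).

    The remaining cells, of total frequency 1 - p - q, are summed out;
    `cutoff` truncates that sum and is assumed large enough for the tail
    to be negligible.
    """
    rest = 1 - p - q
    total = 0.0
    for c in range(cutoff):
        n = a + b + c                                  # sample number for this term
        ways = factorial(n) // (factorial(a) * factorial(b) * factorial(c))
        total += poisson(n, m) * ways * p ** a * q ** b * rest ** c
    return total


m, p, q = 4.0, 0.3, 0.5            # illustrative values only
lhs = joint_cell_frequency(2, 3, m, p, q)
rhs = poisson(2, m * p) * poisson(3, m * q)   # independent Poissons, as (33) asserts
assert isclose(lhs, rhs, rel_tol=1e-9)
```

With a fixed sample number the same joint frequency would not factorise, which is the correlation arrayed just before (33).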

* These moments are calculated by 'Student,' Biometrika, vol. v. (1906-7),
p. 353.

† This result has been pointed out before. See a reference in Pearson's 'On the
Theories of Multiple and Partial Contingency,' Biometrika, vol. xi. (1915-17), p. 149.

By means of it a simple proof of Pearson's expression for the probability that a
given set of deviations from expected values shall arise in the cell numbers in a
random draw of M from any universal, classed in n cells, is reached.

For suppose that in place of drawing a fixed number M we draw variable sample
numbers, with Poisson frequencies about a mean number M. Then, by what is here
shown, the deviation e in any cell number varies independently of the other cell
number deviations, with Poisson errors about the cell mean m. And by (32') the
standard deviation of e is $\sqrt{m}$. Now it is known that the Poisson frequencies
approximate to a Gaussian law if the mean m is not too small. Hence, with this proviso,
the error of the cell number has Gaussian deviate $x = e/\sqrt{m}$ and probability

$$e^{-\frac12 x^2}\, dx / \sqrt{2\pi}.$$

The probability of the set of n deviations is, therefore,

$$\Pi\, e^{-\frac12 x^2}\, dx / \sqrt{2\pi} = (2\pi)^{-n/2}\, e^{-\frac12 \sum x^2}\, dx_1\, dx_2 \dots dx_n$$

and, if this is expressed as an element in n dimensioned space with tensor $\chi$ and
polar angle $\omega$, it is

$$(2\pi)^{-n/2}\, e^{-\frac12 \chi^2}\, \chi^{n-1}\, d\chi\, d\omega.$$

Of these extended probabilities we require those only (divided by their integral)
for which the total of the cell numbers is M, exactly, or for which $\sum e = 0$ or
$\sum \sqrt{m}\, x = 0$. This plane will cut the n dimensioned spheres in a series of n-1
dimensioned spheres and, if $\omega'$ is the polar angle in n-1 dimensioned space, the
probability of the draw is

$$\frac{e^{-\frac12 \chi^2}\, \chi^{n-2}\, d\chi\, d\omega'}{\iint e^{-\frac12 \chi^2}\, \chi^{n-2}\, d\chi\, d\omega'}$$

and the probability rank of the draw is

$$P = \frac{\int_{\chi}^{\infty} e^{-\frac12 \chi^2}\, \chi^{n-2}\, d\chi}{\int_{0}^{\infty} e^{-\frac12 \chi^2}\, \chi^{n-2}\, d\chi},$$

where $\chi^2 = \sum e^2/m$. [See Phil. Mag. vol. l. (6th series) (1900), p. 157.]

4. The simple Gaussian, or normal, distribution may next, for the
sake of further illustration, be derived as a second limiting case of the
binomial frequencies, although the multiple Gaussian will, in a
subsequent section, be derived more generally.

To obtain the transformation, the number n of the sample is again
supposed to approach infinite values, but now it is the magnitude of the
element A in (28) that becomes infinitesimal, the standard deviation,
which by (30) has symbol $A^{\sqrt{npp'}}$, remaining finite.

Let us then, in accordance with (15) or (25), change the unit from
A to B, writing $A^{\sqrt{npp'}} = B^{\sigma}$, or $\sqrt{npp'}\, \alpha = \sigma\beta$, in (29), representing the
binomial (28) referred to its mean. This is now, therefore,

$$\left\{1 + \frac{1}{n} \frac{\sigma^2\beta^2}{2} + \frac{1}{n^{3/2}} \frac{p' - p}{\sqrt{pp'}} \frac{\sigma^3\beta^3}{6} + \frac{1}{n^2} \frac{p'^2 - p'p + p^2}{pp'} \frac{\sigma^4\beta^4}{24} + \dots\right\}^n$$

or

$$e^{\frac12 \sigma^2\beta^2 + \frac{1}{\sqrt{n}} \frac{q}{6} \sigma^3\beta^3 + \frac{1}{n} \frac{q^2 - 2}{24} \sigma^4\beta^4 + \dots}, \tag{34}$$

where q stands for $(p' - p)/\sqrt{pp'}$.
(34), then, shows in array form and as a series in powers of $1/\sqrt{n}$ the
binomial frequencies referred to the mean as origin of measurement
and referred to any unit of measurement, the S.D. of the distribution
having the measure $\sigma$ in that unit.

Leaving for the present the interpretation of the succeeding
approximations*, we have as the first approximation to the binomial
frequencies, when n is large, the array

$$e^{\frac12 \sigma^2 \beta^2}.$$

Now, the resolution for f(x) of the identity

$$e^{\frac12 \sigma^2 \beta^2} = \int f(x)\, e^{x\beta}\, dx,$$

or of

$$e^{\frac12 \sigma^2 (\log B)^2} = \int f(x)\, B^x\, dx,$$

in other words, the expansion of a function in powers of its argument
by infinitesimal increments is the solution of an integral equation.
In the present instance it is easily seen† that the solution is

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac12 x^2/\sigma^2}$$

and, accordingly, that the limit of the binomial frequencies, when n is
infinite and the step A is infinitesimal, is the normal curve of error.

5. It follows from the last paragraph that the Gaussian is very
simply expressed as an array and this simplicity is a feature generally
characterising the method of description employing symbols.

The expression

$$e^{\frac12 \sigma^2 \alpha^2} \tag{35}$$

completely describes the normal distribution and not only gives a ready
means of obtaining the moments, by expansion in powers of $\alpha$, but is
an ensemble that may be handled as a unit, in any process, for the
derivation of further distributions.

To take an instance, samples of n drawn from (35), by (16), have
their totals arrayed $(e^{\frac12 \sigma^2 \alpha^2})^n$, and their means (by the substitution
$n\alpha = \beta$) arrayed $e^{\frac12 \frac{\sigma^2}{n} \beta^2}$. The means, therefore, of samples of n drawn
from an infinite normal universal, have normal distribution also, the
S.D. being reduced to $1/\sqrt{n}$th of the original value.

* See p. 24 seq.

† Or it will follow from (105), p. 46.

6. In precisely the same way the correlated Gaussian may be derived
as the limit reached when samples of very large number n are drawn
from a population in which are distributed, but not independently, two
unit characters, B and C, and the total B and total C are observed in
each sample.

If the frequencies of B and not-B are p and p' and of C and not-C are
q and q' and a, b, c, d are the frequencies of the conjunctions not-B not-C,
B not-C etc., as shown in the figure,

              not-B     B
    not-C       a       b       q'
    C           c       d       q
                p'      p       1

the frequency array of the universal is

$$a + bB + cC + dBC,$$

or

$$1 + p(B - 1) + q(C - 1) + d(B - 1)(C - 1),$$

or

$$1 + p(\beta + \tfrac12 \beta^2 + \dots) + q(\gamma + \tfrac12 \gamma^2 + \dots) + d(\beta\gamma + \dots),$$

and, referred to the means p and q by multiplication by $e^{-p\beta}$, $e^{-q\gamma}$, this
becomes

$$1 + \tfrac12 pp' \beta^2 + \tfrac12 qq' \gamma^2 + (d - pq)\, \beta\gamma + \text{higher orders}. \tag{36}$$

If this is raised to the power n, it will give us the moment array of
the totals in samples of n.

The S.D.'s are therefore $\sqrt{npp'}$, $\sqrt{nqq'}$ and changing the units, by
writing $\sqrt{npp'}\, \beta = \sigma_1 \beta'$, $\sqrt{nqq'}\, \gamma = \sigma_2 \gamma'$, to show arrays with S.D.'s equal
to $\sigma_1$ and $\sigma_2$, the array of samples of n is

$$\left\{1 + \frac{1}{n} \left(\tfrac12 \sigma_1^2 \beta'^2 + \tfrac12 \sigma_2^2 \gamma'^2 + r \sigma_1 \sigma_2 \beta' \gamma'\right) + \frac{1}{n^{3/2}} (\text{cubes}) + \dots\right\}^n, \tag{37}$$

where $r = (d - pq)/\sqrt{pp'qq'}$.

In the limit, therefore, when n is very large, the array is

$$e^{\frac12 \sigma_1^2 \beta'^2 + \frac12 \sigma_2^2 \gamma'^2 + r \sigma_1 \sigma_2 \beta' \gamma'}. \tag{38}$$

This is readily seen to be the value of

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{dx\, dy}{2\pi \sigma_1 \sigma_2 \sqrt{1 - r^2}}\, e^{-\frac{1}{2(1 - r^2)} \left(\frac{x^2}{\sigma_1^2} - \frac{2rxy}{\sigma_1 \sigma_2} + \frac{y^2}{\sigma_2^2}\right)}\, e^{x\beta'}\, e^{y\gamma'},$$

and the same moment array (38) is therefore reached whether as the
limit of the totals in large samples drawn from a universal having a
contingent relation between two unit characters discerned in the members,
or as the array expression of the Gaussian.

It will be observed that, when samples are drawn from such a
universal, the correlation coefficient remains the same, whatever the numbers
drawn, and is equal to the correlation coefficient of the original
population, viz. $(ad - bc)/\sqrt{pp'qq'}$.
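
The invariance of the correlation coefficient with the number drawn can be verified exactly for small n. The sketch below is illustrative: the cell frequencies a, b, c, d are invented, and the helper names `array_mul`, `array_power` and `correlation_squared` are assumptions. The universal is held as an array in the two symbols B and C, raised to the power n as in (16), and the squared correlation of the totals is compared with $(ad - bc)^2/pp'qq'$; the square is used only so that the arithmetic stays in exact fractions.

```python
from fractions import Fraction


def array_mul(u, v):
    """Product of two arrays in B and C, stored as {(index of B, index of C): frequency}."""
    out = {}
    for (i1, j1), c1 in u.items():
        for (i2, j2), c2 in v.items():
            key = (i1 + i2, j1 + j2)
            out[key] = out.get(key, 0) + c1 * c2
    return out


def array_power(u, n):
    """U^n, the array of total B and total C in samples of n (with replacement)."""
    out = {(0, 0): Fraction(1)}
    for _ in range(n):
        out = array_mul(out, u)
    return out


def correlation_squared(array):
    """Square of the correlation coefficient between the two totals."""
    mean_b = sum(c * i for (i, j), c in array.items())
    mean_c = sum(c * j for (i, j), c in array.items())
    var_b = sum(c * (i - mean_b) ** 2 for (i, j), c in array.items())
    var_c = sum(c * (j - mean_c) ** 2 for (i, j), c in array.items())
    cov = sum(c * (i - mean_b) * (j - mean_c) for (i, j), c in array.items())
    return cov * cov / (var_b * var_c)


# Assumed cell frequencies: a = not-B not-C, b = B not-C, c = not-B C, d = B C.
a, b, c, d = Fraction(4, 10), Fraction(1, 10), Fraction(2, 10), Fraction(3, 10)
U = {(0, 0): a, (1, 0): b, (0, 1): c, (1, 1): d}
p, q = b + d, c + d
expected = (a * d - b * c) ** 2 / (p * (1 - p) * q * (1 - q))

for n in (1, 2, 5):
    assert correlation_squared(array_power(U, n)) == expected
```

The check reflects the fact that covariance and both variances of the totals are each multiplied by n, so that their ratio is unaltered.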
7. A more general derivation will be sought for the multiple Gaussian,
than by regarding it as the limit of contingent binomial distributions,
but its array expression can be indicated here, and is clearly

$$e^{\frac12 \sigma_1^2 \alpha_1^2 + \frac12 \sigma_2^2 \alpha_2^2 + \dots + r_{12} \sigma_1 \sigma_2 \alpha_1 \alpha_2 + \dots} \tag{39}$$

extended to any number of variates; $\sigma_1, \sigma_2 \dots$ being the variabilities of
the variates and $r_{12} \dots$ the coefficients of correlation between them.

For in the first place the form, that is, the quadratic index, is a
necessary consequence of the integration of an expression such as

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \dots\ k\, e^{-(\text{quadratic in } x_1, x_2, \dots)}\, e^{x_1 \alpha_1 + x_2 \alpha_2 + \dots}\, dx_1\, dx_2 \dots,$$

whilst in the second place the constant multipliers in the index must
be such as to give the second and product moments as coefficients of
the symbols when (39) is expanded as a moment array.

Hence (39) represents the general correlated Gaussian frequencies
referred to the means. If expanded in integral powers of $\alpha_1, \alpha_2 \dots$ it gives
as the coefficient of $\alpha_1^r \alpha_2^s \dots / r!\, s! \dots$ the r, s ... moment. If $\alpha_1 = \log A_1$ etc.
and it is expanded in powers of $A_1, A_2 \dots$, by infinitesimal increments,
from $-\infty$ to $\infty$, it gives as the coefficient of $A_1^{x_1} A_2^{x_2} \dots$ the normal
frequencies.

If the means, instead of being at the origin, are at $m_1, m_2 \dots$, the
general Gaussian, by (25), has the array

$$e^{m_1 \alpha_1 + m_2 \alpha_2 + \dots + \frac12 \sigma_1^2 \alpha_1^2 + \frac12 \sigma_2^2 \alpha_2^2 + \dots + r_{12} \sigma_1 \sigma_2 \alpha_1 \alpha_2 + \dots}. \tag{40}$$

8.  We  may  now  enquire  under  what  conditions  the  general  Gaussian, 
as  represented  in  array  form  in  (40),  will  be  the  law  of  distribution  of 
the  observed  effects,  when  these  are  brought  about  as  the  result  of  an 
infinity  of  small  independent  contributory  causes. 

If $A_1, A_2 \dots$ are symbols in designation of the observed effects, and
$f_s(A_1, A_2 \dots)$ array the frequencies of the amounts of $A_1, A_2 \dots$
contributed by one, the $s$th, cause, the joint contributions of all the causes
are, by (17), arrayed

$$\prod_s f_s(A_1, A_2 \dots).$$

Let the expansion of $f_s(A_1, A_2, \dots)$ in moment symbols be

$$f_s(e^{\alpha_1}, e^{\alpha_2} \dots) = 1 + m_1\alpha_1 + m_2\alpha_2 + \dots + \tfrac12\sigma_1^2\alpha_1^2 + \tfrac12\sigma_2^2\alpha_2^2 + \dots + r_{12}\sigma_1\sigma_2\alpha_1\alpha_2 + \dots + \tfrac16\beta\sigma_1^3\alpha_1^3 + \tfrac12\beta'\sigma_1^2\sigma_2\alpha_1^2\alpha_2 + \dots,$$

so that $m_1, m_2 \dots$ are the means, $\sigma_1^2, \sigma_2^2 \dots r_{12}\sigma_1\sigma_2 \dots$ the second and
product moments from the means, $\beta\sigma_1^3, \beta'\sigma_1^2\sigma_2 \dots$ the higher moments
of the distribution of effects peculiar to this contributor.

Then, if the individual means $m_1, m_2 \dots$ and standard deviations
$\sigma_1, \sigma_2 \dots$ are infinitesimal and the number of contributors is infinite,
but so that the sums $[m_1], [m_2] \dots [\sigma_1^2], [\sigma_2^2] \dots$ are finite, in the limit,

$$\prod_s f_s(A_1, A_2 \dots) = e^{[m_1]\alpha_1 + [m_2]\alpha_2 + \dots + \frac12[\sigma_1^2]\alpha_1^2 + \dots + [r_{12}\sigma_1\sigma_2]\alpha_1\alpha_2 + \dots},$$

the general Gaussian.

The assumption that has been made in drawing the above conclusion
is that the higher moment ratios $\beta, \beta' \dots$, obtained by dividing the higher
moments of each contributory distribution of fortuities by the corresponding
powers of their standard deviations, are all finite. This, at any
rate, appears to be a sufficient condition, in addition to the conditions
named above, that may be supposed to govern the fortuities of the independent
constituents, to ensure a resulting Gaussian law of distribution.

9. We have shown how the array expression for the Gaussian distribution
may be obtained by integration of the frequency curves and
surfaces. Sometimes the reverse process is desirable and, in illustration,
we will derive some properties of the tri-variate Gaussian and solve a
problem in selection by expanding (in part) the frequency array.

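The expansion of the tri-variate Gaussian, referred to means as origin
and s.d.'s as units, with respect to variate $x_1$, is performed as follows:

$$e^{\frac12\alpha_1^2 + \frac12\alpha_2^2 + \frac12\alpha_3^2 + r_{12}\alpha_1\alpha_2 + r_{13}\alpha_1\alpha_3 + r_{23}\alpha_2\alpha_3}$$

$$= e^{\frac12(\alpha_1 + r_{12}\alpha_2 + r_{13}\alpha_3)^2} \times e^{\frac12(1-r_{12}^2)\alpha_2^2 + \frac12(1-r_{13}^2)\alpha_3^2 + (r_{23}-r_{12}r_{13})\alpha_2\alpha_3}$$

$$= \int_{-\infty}^{\infty} e^{x_1(\alpha_1 + r_{12}\alpha_2 + r_{13}\alpha_3)}\,\frac{1}{\sqrt{2\pi}}\,e^{-\frac12 x_1^2}\,dx_1 \times \text{the same} \tag{41}$$

$$= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-\frac12 x_1^2}\,dx_1\; e^{x_1\alpha_1} \times e^{r_{12}x_1\alpha_2 + r_{13}x_1\alpha_3 + \frac12(1-r_{12}^2)\alpha_2^2 + \frac12(1-r_{13}^2)\alpha_3^2 + (r_{23}-r_{12}r_{13})\alpha_2\alpha_3}$$

and, in this form, the distribution is shown resolved into a normal
frequency array of $x_1$'s and each $x_1$ array distributed as a partial normal
array of $x_2, x_3$ with the well-known means $r_{12}x_1$, $r_{13}x_1$, standard deviations
$\sqrt{1-r_{12}^2}$, $\sqrt{1-r_{13}^2}$ and correlation coefficient

$$\frac{r_{23} - r_{12}r_{13}}{\sqrt{1-r_{12}^2}\,\sqrt{1-r_{13}^2}}.$$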
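The resolution in (41) can be put into numbers with a short sketch. The function name and argument order below are illustrative assumptions, not part of the text; unit standard deviations and zero means are assumed, as in (41).

```python
import math

def conditional_pair_given_first(r12, r13, r23, x1):
    """Moments of (x2, x3) within the array of a fixed x1.

    Assumes zero means and unit s.d.'s, as in (41). Returns
    (mean of x2, mean of x3, s.d. of x2, s.d. of x3, partial correlation).
    """
    mean2 = r12 * x1
    mean3 = r13 * x1
    sd2 = math.sqrt(1.0 - r12 ** 2)
    sd3 = math.sqrt(1.0 - r13 ** 2)
    partial_r = (r23 - r12 * r13) / (sd2 * sd3)
    return mean2, mean3, sd2, sd3, partial_r
```

When $r_{23} = r_{12}r_{13}$ the partial correlation vanishes, that is, the whole of the correlation between the second and third variates is then accounted for by the first.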

Problems in selection or 'changing the frequencies' may be worked
out by expanding the array with respect to the frequencies to be changed
and, after the change or selection has been made, reintegrating into
array form, the remaining variates being retained in this form throughout.
Thus, suppose that the variate $A_1$ in the above distribution is
'selected' to show a normal distribution, with mean $h$ and s.d. $s$, in
place of a mean 0 and s.d. unity. The distribution has been expanded
with respect to this variate in (41) and, now, modifying the frequencies
in accordance with the desired selection, (41) becomes

$$\int_{-\infty}^{\infty} \frac{1}{s\sqrt{2\pi}}\,e^{-\frac{(x_1-h)^2}{2s^2}}\,dx_1\; e^{x_1(\alpha_1 + r_{12}\alpha_2 + r_{13}\alpha_3)} \times e^{\frac12(1-r_{12}^2)\alpha_2^2 + \frac12(1-r_{13}^2)\alpha_3^2 + (r_{23}-r_{12}r_{13})\alpha_2\alpha_3}$$

$$= e^{h\alpha_1 + r_{12}h\alpha_2 + r_{13}h\alpha_3 + \frac12 s^2\alpha_1^2 + \frac12\{1-r_{12}^2(1-s^2)\}\alpha_2^2 + \frac12\{1-r_{13}^2(1-s^2)\}\alpha_3^2 + s^2r_{12}\alpha_1\alpha_2 + s^2r_{13}\alpha_1\alpha_3 + \{r_{23}-r_{12}r_{13}(1-s^2)\}\alpha_2\alpha_3},$$

showing that $A_2, A_3$ have means, s.d.'s and correlation coefficient*
changed from 0, 0, 1, 1, $r_{23}$ to

$$r_{12}h,\quad r_{13}h,\quad \sqrt{1-r_{12}^2(1-s^2)},\quad \sqrt{1-r_{13}^2(1-s^2)},\quad \frac{r_{23}-r_{12}r_{13}(1-s^2)}{\sqrt{1-r_{12}^2(1-s^2)}\,\sqrt{1-r_{13}^2(1-s^2)}}.$$
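A minimal sketch of the selection result, reading the new constants of $A_2, A_3$ from the exponent of the reintegrated array. The function name is an illustrative assumption; the unselected population is taken with zero means and unit s.d.'s as above.

```python
import math

def select_first_variate(r12, r13, r23, h, s):
    """Constants of A2, A3 after A1 is selected to mean h and s.d. s.

    Returns (mean of A2, mean of A3, s.d. of A2, s.d. of A3, correlation),
    read from the linear, square and product terms of the exponent.
    """
    shrink = 1.0 - s ** 2                 # the factor (1 - s^2)
    var2 = 1.0 - r12 ** 2 * shrink
    var3 = 1.0 - r13 ** 2 * shrink
    cov23 = r23 - r12 * r13 * shrink
    return (r12 * h, r13 * h,
            math.sqrt(var2), math.sqrt(var3),
            cov23 / math.sqrt(var2 * var3))
```

With $h = 0$, $s = 1$ no selection has taken place and the original constants return; with $s = 0$ the result is the single $x_1 = h$ array of (41).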

10. Gaussian derivatives. It has been shown by Pearson that the
quadrature of the correlated Gaussian, between given x and y planes,
may be effected by expanding the frequencies in powers of the correlation
coefficient r.

This expansion is readily arrived at when symbols are used and an
interesting relation is shown.

When the two characters A, B are referred to their means as origin
and standard deviations as units of measurement the normal array, by
(38), is

$$e^{\frac12\alpha^2 + \frac12\beta^2 + r\alpha\beta}$$

and its expansion in powers of r is

$$= \sum_{s=0}^{\infty} \alpha^s e^{\frac12\alpha^2}\cdot\beta^s e^{\frac12\beta^2}\cdot\frac{r^s}{s!}, \tag{42}$$

showing the coefficients as the products of two factors, one an array in
A and the other in B.

The array,

$$\alpha^s e^{\frac12\alpha^2},$$

is of interest. Clearly the total and first $s - 1$ moments are zero, since
the expansion begins with the term $\alpha^s$; and negative frequencies are
therefore involved. The integral equation is easily solved and the
resolution into frequencies is

$$\alpha^s e^{\frac12\alpha^2} = \int_{-\infty}^{\infty} e^{\alpha x}\left(-\frac{d}{dx}\right)^{s}\frac{1}{\sqrt{2\pi}}\,e^{-\frac12 x^2}\,dx, \tag{43}$$

as will be seen by performing s integrations by parts and finally using
III, 4.

* In his paper 'On the Influence of Natural Selection on the Variability and
Correlation of Organs,' Phil. Trans. vol. 200a (1903), Pearson develops the general
formulae and on p. 25 gives this case as an instance.

The array $\alpha^s e^{\frac12\alpha^2}$ therefore represents the sth derivative of the normal
curve of error (with sign adjustment) and is the function considered by
Thiele* who tables the polynomials and their zeros.
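The polynomial multipliers of the Gaussian derivatives can be generated by applying $-d/dx$ repeatedly. The sketch below is an illustration only (the function names are assumptions); it uses the fact that if $g = P(x)\,e^{-\frac12x^2}$ then $-g' = \{xP(x) - P'(x)\}\,e^{-\frac12x^2}$.

```python
def neg_derivative_step(poly):
    """One application of -d/dx to P(x) exp(-x^2/2).

    `poly` lists the coefficients of P from the constant term upward;
    the returned list gives Q = x P - P' in the same order.
    """
    x_times_p = [0] + list(poly)
    p_prime = [k * poly[k] for k in range(1, len(poly))]
    return [c - (p_prime[k] if k < len(p_prime) else 0)
            for k, c in enumerate(x_times_p)]

def thiele_polynomial(s):
    """Polynomial multiplying exp(-x^2/2) in its s-th derivative, sign adjusted."""
    poly = [1]
    for _ in range(s):
        poly = neg_derivative_step(poly)
    return poly
```

For example `thiele_polynomial(3)` gives the coefficients of $x^3 - 3x$ and `thiele_polynomial(4)` those of $x^4 - 6x^2 + 3$.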

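It follows then that the identity, (42), between integrated arrays
has, as its expanded equivalent,

$$\int\!\!\int e^{\alpha x + \beta y}\,z\,dx\,dy = \sum_{s=0}^{\infty}\int e^{\alpha x}\left(-\frac{d}{dx}\right)^{s} z_x\,dx \int e^{\beta y}\left(-\frac{d}{dy}\right)^{s} z_y\,dy\;\frac{r^s}{s!},$$

where

$$z = \frac{1}{2\pi\sqrt{1-r^2}}\,e^{-\frac{x^2 - 2rxy + y^2}{2(1-r^2)}},\qquad z_x = \frac{1}{\sqrt{2\pi}}\,e^{-\frac12x^2}\quad\text{and}\quad z_y = \frac{1}{\sqrt{2\pi}}\,e^{-\frac12y^2},$$

and it follows that

$$z = \sum_{s=0}^{\infty}\left(-\frac{d}{dx}\right)^{s} z_x\left(-\frac{d}{dy}\right)^{s} z_y\;\frac{r^s}{s!} \tag{44}$$

and that†

$$\int_x^{\infty}\!\!\int_y^{\infty} z\,dx\,dy = \sum_{s=0}^{\infty}\left(-\frac{d}{dx}\right)^{s-1} z_x\left(-\frac{d}{dy}\right)^{s-1} z_y\;\frac{r^s}{s!}.$$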
The coefficients of $r^s$ in the expression of the integral, from given x, y
planes, of the correlated Gaussian, as a series of powers of the correlation
coefficient r, were found by Pearson and tabulated by Everitt‡,
under the name of Tetrachoric Functions, which are thus seen to be
related with Thiele's Gaussian derivatives.
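The series (44) for the ordinate can be checked numerically against the closed form of the correlated Gaussian. The sketch below is illustrative; the function names and the number of terms retained are assumptions, and the derivative polynomials are obtained by the usual recurrence $H_{s+1} = xH_s - sH_{s-1}$.

```python
import math

def hermite_values(x, count):
    """Values at x of the first `count` derivative polynomials H_0, H_1, ..."""
    values = [1.0, x]
    for s in range(1, count - 1):
        values.append(x * values[s] - s * values[s - 1])
    return values[:count]

def normal_ordinate(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def bivariate_ordinate(x, y, r):
    """Closed form of z for unit s.d.'s and correlation r."""
    q = (x * x - 2.0 * r * x * y + y * y) / (1.0 - r * r)
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(1.0 - r * r))

def series_ordinate(x, y, r, terms=60):
    """Partial sum of (44): sum over s of H_s(x) z_x . H_s(y) z_y . r^s / s!"""
    hx, hy = hermite_values(x, terms), hermite_values(y, terms)
    total, power = 0.0, 1.0          # power holds r^s / s!
    for s in range(terms):
        total += hx[s] * hy[s] * power
        power *= r / (s + 1)
    return total * normal_ordinate(x) * normal_ordinate(y)
```

For moderate r the two functions agree closely; the series converges slowly as r approaches unity, so more terms would then be needed.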

The extension of the series expression for the ordinate to Gaussians
with more than two variates is easily made and the expansion of the
tri-variate Gaussian array,

$$e^{\frac12\alpha_1^2 + \frac12\alpha_2^2 + \frac12\alpha_3^2 + r_{12}\alpha_1\alpha_2 + r_{13}\alpha_1\alpha_3 + r_{23}\alpha_2\alpha_3},$$

in powers of $r_{12}, r_{13}, r_{23}$ leads to the result

$$z = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty}\sum_{k=0}^{\infty}\left(-\frac{d}{dx_1}\right)^{i+j} z_1\left(-\frac{d}{dx_2}\right)^{i+k} z_2\left(-\frac{d}{dx_3}\right)^{j+k} z_3\;\frac{r_{12}^{\,i}\,r_{13}^{\,j}\,r_{23}^{\,k}}{i!\,j!\,k!}, \tag{45}$$

where z now stands for the ordinate of the general tri-variate Gaussian
distribution referred to the means as origin and standard deviations as
units of measurement and $z_1, z_2, z_3$ as before stand for the ordinates of
the simple Gaussian totals, $\frac{1}{\sqrt{2\pi}}\,e^{-\frac12x_1^2}$ etc.

* Almindelig Iagttagelseslaere: Sandsynlighedsregning og Mindste Kvadraters
Methode, T. N. Thiele, 1889. Thiele omits the $1/\sqrt{2\pi}$.

† In the first term, $\left(-\frac{d}{dx}\right)^{-1} z_x$ will stand for $\int_x^{\infty} z_x\,dx$.

‡ Phil. Trans. A, vol. 195 (1903), p. 1 and Biometrika, vol. vii, 1910. Here

§ The sign "L abbreviates 'to the power of.'

11. From the expansion of the ordinate of the general Gaussian in
powers of the correlation coefficients, as shown in (44) or (45), not only
may the integral of the frequencies, between given variate limits, be
found, but the successive integrals or moments* for such blocks of the
solid may also, from the form of the resolution, be readily calculated.

The calculation of such a part-coefficient as $\int x^p\left(-\frac{d}{dx}\right)^{s} z_x\,dx$ can
be performed, either by obtaining the polynomial multiplier of $z_x$ in the
integrand and using Dr Lee's table of the values of the incomplete
normal moment function†, or by integrating by parts, when there results
a series of tetrachoric functions multiplied by powers of x, followed
(if $p \geq s$) by a single moment function of order $p - s$.

12. Moreover the expansion (44) may be applied to extending
Pearson's equation for r‡ from tetrachoric to polychoric groupings. If
the classification gives more than a simple dichotomy of each of the
two variates the problem becomes one of probability. Assuming a normal
distribution of the marginal totals, without errors, and thence a knowledge
of the values of the class divisions on the normal scale, no assumed
value of r will, in general, give all the observed cell numbers.

We will find the probability of the observed numbers for any assumed
r on the supposition that the errors are due to random sampling and,
by making this a maximum, obtain an equation for the most putable§ r.

* We may call to mind that the nth moment about one end of a frequency
distribution divided by n! is equal (by integrating n times by parts) to the (n + 1)th
integral from the other end.

† Tables for Statisticians and Biometricians, Table m. ‡ Loc. cit.

§ There is no most-probable r, since the sampling is from an unknown universal;
and, for the same reason, there are no probable errors to the calculated value of r.
Theoretically we could calculate the putable errors, defining the range of putable
error as that within which the universal must lie if the probability rank of the
sample is to be greater than ½. If more than one constant is in question we could

Let u, v stand for

$$\int_{-\infty}^{x}\frac{1}{\sqrt{2\pi}}\,e^{-\frac12x^2}\,dx \quad\text{and}\quad \int_{-\infty}^{y}\frac{1}{\sqrt{2\pi}}\,e^{-\frac12y^2}\,dy,$$

the integral frequencies, or ranks, of the probability curve; and let dashed
letters indicate derivatives, so that $u', v'$ are the ordinates and $-u'', -v''$
the first (neg.) derivatives etc. Denote by $\delta u, \delta v$ the difference of rank
of adjacent marginal divisions and by $\delta w$ the integral frequency of the
enclosed cell.
Then, by integrating (44) between divisional limits,

$$\delta w = \delta u\,\delta v + \delta u'\,\delta v'\cdot r + \delta u''\,\delta v''\cdot\frac{r^2}{2!} + \dots \tag{46}$$
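The cell frequency series (46) can be sketched as follows and compared with a direct quadrature of the correlated surface over the same cell. All names are illustrative assumptions; Simpson's rule is used only as an independent check and is not the method of the text. The kth derivative of the rank is taken as $u^{(k)} = (-1)^{k-1}H_{k-1}(x)\,z_x$, with $H$ the derivative polynomials.

```python
import math

def z(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def rank(x):
    """Integral frequency u of the probability curve up to x."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def derivative_of_rank(x, k):
    """k-th derivative (k >= 1) of the rank: (-1)^(k-1) H_{k-1}(x) z(x)."""
    prev, cur = 0.0, 1.0
    for s in range(k - 1):
        prev, cur = cur, x * cur - s * prev
    return (-1.0) ** (k - 1) * cur * z(x)

def cell_frequency(x_lo, x_hi, y_lo, y_hi, r, terms=40):
    """Partial sum of (46) for the cell between the given divisions."""
    total = (rank(x_hi) - rank(x_lo)) * (rank(y_hi) - rank(y_lo))
    power = 1.0
    for k in range(1, terms):
        power *= r / k                      # r^k / k!
        du = derivative_of_rank(x_hi, k) - derivative_of_rank(x_lo, k)
        dv = derivative_of_rank(y_hi, k) - derivative_of_rank(y_lo, k)
        total += du * dv * power
    return total

def simpson_cell(x_lo, x_hi, y_lo, y_hi, r, n=80):
    """Check value: Simpson quadrature of the bivariate ordinate (n even)."""
    def weight(i):
        return 1 if i in (0, n) else (4 if i % 2 else 2)
    def density(x, y):
        q = (x * x - 2.0 * r * x * y + y * y) / (1.0 - r * r)
        return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(1.0 - r * r))
    hx, hy = (x_hi - x_lo) / n, (y_hi - y_lo) / n
    total = 0.0
    for i in range(n + 1):
        for j in range(n + 1):
            total += weight(i) * weight(j) * density(x_lo + i * hx, y_lo + j * hy)
    return total * hx * hy / 9.0
```

At $r = 0$ the series reduces to its first term $\delta u\,\delta v$, the product of the marginal differences of rank.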

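Now, if n is the observed number in the cell, the probability of the
sample for the given classification is, by (6),

$$\frac{(\sum n)!}{\prod n!}\,\prod(\delta w)^n,$$

taken for all cells.

To make this a maximum requires $\prod(\delta w)^n$ a maximum, or $\sum n\log(\delta w)$
a maximum.

The value of r is therefore given by

$$\sum\frac{n}{\delta w}\,\frac{d}{dr}\,\delta w = 0,$$

or

$$\sum n\,\frac{\delta u'\,\delta v' + \delta u''\,\delta v''\cdot r + \delta u'''\,\delta v'''\cdot\frac{r^2}{2!} + \dots}{\delta u\,\delta v + \delta u'\,\delta v'\cdot r + \delta u''\,\delta v''\cdot\frac{r^2}{2!} + \dots} = 0. \tag{47}$$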


13. Returning now to (34), expressing the binomial frequencies
referred to the mean as origin and a unit of measurement making the
standard deviation $\sigma$ and shown as a series in powers of $1/\sqrt{n}$, it is seen
that the array coefficients are Thiele derivatives and that the binomial
is equimomental, therefore, with the frequency series


1  + 


q  0-' 


/      ^  V       1   f  /  o      1 ,  crV      d\'        .<t''/      d  Y] 


where  g  =  {p  -  p)!'!  '^pp. 

calculate the contour of values having the same property. Practically the work
would be prohibitive.

14. The Gaussian and Gaussian derivatives are curves with infinite
limits in both directions and a series of such curves can be fitted by
moments to a distribution of frequencies having a like characteristic.

Let $\phi(\alpha)$ express the given distribution in moment array. Multiply
by the expansion of $e^{-\frac12\alpha^2}$ and let the result be the series $1 + c_1\alpha + c_2\alpha^2 + \dots$.
Therefore

$$\phi(\alpha) = (1 + c_1\alpha + c_2\alpha^2 + \dots)\,e^{\frac12\alpha^2} \tag{48}$$

and so the required resolution is effected, in arrays.

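If $y = f(x)$ is the required frequency curve, by (43), the resolution
in frequencies is

$$f(x) = \left\{1 + c_1\left(-\frac{d}{dx}\right) + c_2\left(-\frac{d}{dx}\right)^{2} + \dots\right\}\frac{1}{\sqrt{2\pi}}\,e^{-\frac12x^2}. \tag{49}$$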
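The determination of the c's in (48) is a multiplication of two power series, which the following sketch carries out in exact fractions. The function name is an assumption; the moments supplied are taken about the origin of the arbitrary Gaussian, in units of its standard deviation, and should be integers or `Fraction` values.

```python
from fractions import Fraction
from math import factorial

def gaussian_series_coefficients(moments):
    """Coefficients c_0, c_1, ... of (48) from moments[k] = k-th moment.

    moments[0] must be 1. The moment array has coefficients moments[k]/k!;
    it is multiplied by the expansion of exp(-alpha^2/2).
    """
    order = len(moments) - 1
    array = [Fraction(moments[k], factorial(k)) for k in range(order + 1)]
    inverse_gauss = [Fraction(0)] * (order + 1)
    for j in range(0, order + 1, 2):
        inverse_gauss[j] = Fraction((-1) ** (j // 2),
                                    2 ** (j // 2) * factorial(j // 2))
    return [sum(array[i] * inverse_gauss[k - i] for i in range(k + 1))
            for k in range(order + 1)]
```

Supplying the normal moments 1, 0, 1, 0, 3, 0, 15 returns unity followed by zeros, as it should, since the distribution is then the Gaussian itself.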

If the mean of the given distribution is m and standard deviation $\sigma$,
these may conveniently be taken as parameters of the arbitrary Gaussian,
when the resolution to be performed is expressed, in arrays,

$$\phi(\alpha) = (1 + c_3\alpha^3 + c_4\alpha^4 + \dots)\,e^{m\alpha + \frac12\sigma^2\alpha^2} \tag{50}$$

and, the c's being determined by the same expedient as described above,
the resolution in frequencies is, by an obvious extension of (43),

$$f(x) = \left\{1 + c_3\left(-\frac{d}{dx}\right)^{3} + c_4\left(-\frac{d}{dx}\right)^{4} + \dots\right\}\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-m)^2}{2\sigma^2}}. \tag{51}$$

If a limited number, only, of the moments of the distribution are
given, for instance, if $\phi(\alpha)$ is an empirical series, the c's are calculated
to the same order and the fitted curves are equimomental to this order.

If the distribution to be resolved have not infinite limits in both
directions, other curves may with advantage take the place of the
Gaussian, the treatment, by derivatives, being similar to what precedes.
We will next consider curves suited to describe statistical frequencies
extending to infinity in one direction, only, and their arrays.

15. Gamma type distributions and their derivatives. The frequencies
having logarithmic decrement are expressed by the exponential curve

$$y = e^{-x}, \tag{52}$$

of which the array is

$$\int_0^{\infty} e^{-x}\,e^{\alpha x}\,dx = \frac{1}{1-\alpha}. \tag{53}$$

If random samples of p are taken, the total of the character x in
them, by (16), has the array

$$\frac{1}{(1-\alpha)^p},$$

but

$$\frac{1}{(1-\alpha)^p} = \int_0^{\infty} e^{\alpha x}\,\frac{x^{p-1}e^{-x}}{\Gamma(p)}\,dx \tag{54}$$

and so the Gamma curves

$$y = \frac{x^{p-1}e^{-x}}{\Gamma(p)}$$

are derived as the result of sampling the exponential frequencies (52).
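As in the last section we have, now provided $s < p$,

$$\frac{\alpha^s}{(1-\alpha)^p} = \int_0^{\infty} e^{\alpha x}\left(-\frac{d}{dx}\right)^{s}\frac{x^{p-1}e^{-x}}{\Gamma(p)}\,dx \tag{55}$$

and hence that the left-hand member is the moment array of the
frequencies

$$y = \left(-\frac{d}{dx}\right)^{s}\frac{x^{p-1}e^{-x}}{\Gamma(p)}. \tag{56}$$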

If, then, $\phi(\alpha)$ array the moments, up to the sth moment, of any
frequency distribution having range 0 to $\infty$ and if we multiply by the
expansion of $(1-\alpha)^p$, where p is any value $> s$, and obtain the series
$1 + c_1\alpha + c_2\alpha^2 + \dots$, it follows that, to the sth order of moments,

$$\phi(\alpha) = (1 + c_1\alpha + c_2\alpha^2 + \dots)\big/(1-\alpha)^p, \tag{57}$$

thus resolving the array into a series of Gamma derivatives.

If $y = f(x)$ is the required frequency curve, the resolution is by (55)

$$f(x) = \left\{1 + c_1\left(-\frac{d}{dx}\right) + c_2\left(-\frac{d}{dx}\right)^{2} + \dots + c_s\left(-\frac{d}{dx}\right)^{s}\right\}\frac{x^{p-1}e^{-x}}{\Gamma(p)}. \tag{58}$$
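The c's of (57) follow in the same way as for the Gaussian, the multiplier now being the expansion of $(1-\alpha)^p$. A minimal sketch in exact fractions; the function name is an assumption, and the moments and p should be integers or `Fraction` values, with p exceeding the order of the last moment given.

```python
from fractions import Fraction
from math import factorial

def gamma_series_coefficients(moments, p):
    """Coefficients c_0, c_1, ... of (57) from moments[k] about the origin.

    moments[0] must be 1. The array coefficients moments[k]/k! are
    multiplied by the binomial expansion of (1 - alpha)^p.
    """
    order = len(moments) - 1
    array = [Fraction(moments[k], factorial(k)) for k in range(order + 1)]
    binom = [Fraction(1)]                  # coefficients of (1 - alpha)^p
    for j in range(1, order + 1):
        binom.append(binom[-1] * (Fraction(p) - (j - 1)) * Fraction(-1, j))
    return [sum(array[i] * binom[k - i] for i in range(k + 1))
            for k in range(order + 1)]
```

The moments 1, 3, 12 of the Gamma curve with p = 3 return unity followed by zeros; the exponential moments 1, 1, 2 resolved on the same curve give the series $1 - 2\alpha + \alpha^2$, in agreement with $(1-\alpha)^2/(1-\alpha)^3 = 1/(1-\alpha)$.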

The  resolution  of  frequency  distributions  into  Gaussian  and  Gamma 
integrals  and  their  derivatives  presents  an  advantage  for  purposes  of 
quadrature  since  (49),  (51),  (58)  are  clearly  integrable  with  the  help  of 
existing  tables*,  the  first  term  only  requiring  the  tables  of  integrals. 


IV. SAMPLING WITHOUT REPLACEMENT AND PARTITIONING
A LIMITED POPULATION. HYPERGEOMETRIC AND
KINDRED FREQUENCIES

1.  We  will  next  enquire  how  far  the  symbols  of  denomination  may 
be  brought  into  use  to  simulate  the  fortuities  arising  when  units  and 
groups  are  taken  and  not  replaced. 

Clearly, when samples are drawn en bloc from a limited population,
the supposition made in (16) that each successive unit contributing to
a sample is drawn from an unchanging population U is no longer tenable
and $U^s$ will not correctly array the samples of s. A different algebraical
analogue is required, which may be made quite general, and applicable
to multiple sampling, as follows.

* Tables of the incomplete Gamma function are now at press.

2. Let U comprise l units A, m units B etc. $(l + m + \dots = n)$, so that
in product symbol

$$U = A^l B^m \dots$$

and let the sampling consist in partitioning U into samples of number
$s_1, s_2 \dots$ $(s_1 + s_2 + \dots = n)$ and of designation $S_1, S_2 \dots$.

Then any unit as A may fall into any sample and, if the appropriate
suffix designate its destination, the alternative fortuities of A are
arrayed

$$A_1S_1 + A_2S_2 + \dots.$$

Hence all the possible partitions of U are arrayed

$$(A_1S_1 + A_2S_2 + \dots)^l\,(B_1S_1 + B_2S_2 + \dots)^m \dots, \tag{59}$$

any single term such as $A_1^{a_1}B_1^{b_1} \dots S_1^{s_1}\,A_2^{a_2}B_2^{b_2} \dots S_2^{s_2} \dots$ signifying a
particular partition of U into samples of $s_1, s_2 \dots$ in number, viz., that
partition in which sample 1 has $a_1$ A's, $b_1$ B's ... and sample 2 has
$a_2$ A's, $b_2$ B's, ... and so on.

We require to extract from (59) all the terms in $S_1^{s_1}S_2^{s_2} \dots$, their
number (by suppressing the symbols A, B ...) being coefficient of
$S_1^{s_1}S_2^{s_2} \dots$ in $(S_1 + S_2 + \dots)^n$ or $n!/s_1!\,s_2! \dots$ say ${}^nC_{s_1, s_2 \dots}$.

If, therefore, from $U = A^lB^m \dots$ is taken a first sample of $s_1$, a second
sample of $s_2$ and so on, the last sample being the unselected remainder,
the frequency array of the constitutions of the samples is expressed as
the partial array,

$$\text{Array } S_1^{s_1}S_2^{s_2} \dots : (A_1S_1 + A_2S_2 + \dots)^l\,(B_1S_1 + B_2S_2 + \dots)^m \dots \big/ {}^nC_{s_1, s_2 \dots}. \tag{60}$$

3. It will be observed that (59), with the S's omitted, merely arrays
all the possible fortuities of the units by ascribing to their objective
symbols the distributive law of common algebra. The symbols $S_1, S_2 \dots$
are introduced to play the part of counters of sample numbers and
enable the cases containing $s_1$ in sample 1, $s_2$ in sample 2 etc. to be
segregated, as is done in (60).

We have supposed the universal to be made up of separate kinds
A, B ... but, it will be clear, that these symbols may stand for complex
characters. For example, if $A = P'Q'R'$, $B = PQ'R' \dots$, $H = PQR$, indicating
that three characters and their negatives are discerned in each
individual, then the result will give the distributions of total-P, total-Q,
total-R in the samples. It is unnecessary however to elaborate special
formulae since, by following the elementary principle just noticed, the
frequency arrays for particular classes of problem may usually be written
down without great difficulty.

When  the  array  has  been  expressed  in  symbols,  it  can  be  analysed, 
or  combined,  and,  in  particular,  its  moments  obtained  by  the  methods 
already  explained.    Four  examples  follow. 

4. As an elementary instance we will first find, by the present
methods, the moment array of the hypergeometric frequencies, viz., the
frequencies of A in samples of s drawn, without replacement, from a
population of n units of whom l are A*.

Suppressing the unnecessary not-A and not-sample symbols, the
frequency array is, by (60),

$$\text{Array } S^s : (1+S)^{n-l}(1+AS)^{l}\big/{}^nC_s \tag{61}$$

and the moment array is therefore

$$\text{Array } S^s : (1+S)^{n-l}(1+e^{\alpha}S)^{l}\big/{}^nC_s, \tag{61'}$$

$$= \text{Array } S^s : (1+S)^{n-l}\left\{(1+S) + \left(\alpha + \frac{\alpha^2}{2!} + \dots\right)S\right\}^{l}\Big/{}^nC_s$$

$$= 1 + \frac{l}{1!}\,\frac{s}{n}\left(\alpha + \frac{\alpha^2}{2!} + \dots\right) + \frac{l(l-1)}{2!}\,\frac{s(s-1)}{n(n-1)}\left(\alpha + \frac{\alpha^2}{2!} + \dots\right)^{2} + \dots. \tag{62}$$

The first three moments about the origin are therefore

$$\frac{ls}{n};\qquad \frac{ls}{n} + \frac{l(l-1)\,s(s-1)}{n(n-1)};\qquad \frac{ls}{n} + 3\,\frac{l(l-1)\,s(s-1)}{n(n-1)} + \frac{l(l-1)(l-2)\,s(s-1)(s-2)}{n(n-1)(n-2)},$$

and the rth moment about the origin may be expressed as

$${}^{1}E_r\,\frac{ls}{n} + {}^{2}E_r\,\frac{l(l-1)\,s(s-1)}{2!\,n(n-1)} + {}^{3}E_r\,\frac{l(l-1)(l-2)\,s(s-1)(s-2)}{3!\,n(n-1)(n-2)} + \dots, \tag{62'}$$

where ${}^{i}E_r$ is defined by

$$(e^{\alpha}-1)^{i} = \sum_{r=i}^{\infty}{}^{i}E_r\,\frac{\alpha^r}{r!},$$

and the series terminates with the rth, lth or sth term, whichever is
least (usually r).
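The series (62') may be compared with the moments computed directly from the hypergeometric frequencies. The sketch below is illustrative (the function names are assumptions); exact fractions are used so that the two routes can be tested for equality.

```python
from fractions import Fraction
from math import comb, factorial

def falling(x, k):
    """x (x - 1) ... to k factors."""
    out = 1
    for i in range(k):
        out *= x - i
    return out

def e_number(i, r):
    """iE_r: r! times the coefficient of alpha^r in (e^alpha - 1)^i."""
    return sum((-1) ** (i - j) * comb(i, j) * j ** r for j in range(i + 1))

def moment_by_array(n, l, s, r):
    """r-th moment about the origin by the series (62')."""
    return sum(Fraction(falling(l, i) * falling(s, i) * e_number(i, r),
                        factorial(i) * falling(n, i))
               for i in range(1, min(r, l, s) + 1))

def moment_direct(n, l, s, r):
    """r-th moment from the hypergeometric frequencies themselves."""
    return sum(Fraction(comb(l, a) * comb(n - l, s - a), comb(n, s)) * a ** r
               for a in range(max(0, s - (n - l)), min(l, s) + 1))
```

With n = 10, l = 4, s = 3 both routes give the mean 6/5, and they agree for the higher moments also.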

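* The moments of the hypergeometrical frequencies are worked out by Pearson
in his paper 'On Certain Properties of the Hypergeometric Series,' Phil. Mag.
vol. L. (5th series) (1900), p. 157.

5. As a second illustration we may take the double hypergeometric*
and find the frequency and moment arrays of the correlated numbers of
A found in two separate samples of $s_1$ and $s_2$ drawn from a population
of n, of whom l are A, without replacement.

By what precedes, the frequency array will be, on suppressing the
unrequired symbols,

$$\text{Array } S_1^{s_1}S_2^{s_2} : (1 + S_1 + S_2)^{n-l}\,(1 + A_1S_1 + A_2S_2)^{l}\big/{}^nC_{s_1, s_2} \tag{63}$$

and the moment array

$$\text{Array } S_1^{s_1}S_2^{s_2} : (1 + S_1 + S_2)^{n-l}\,(1 + e^{\alpha_1}S_1 + e^{\alpha_2}S_2)^{l}\big/{}^nC_{s_1, s_2}$$

$$= \text{Array } S_1^{s_1}S_2^{s_2} : (1 + S_1 + S_2)^{n-l}\left\{(1 + S_1 + S_2) + \left(\alpha_1 + \frac{\alpha_1^2}{2!} + \dots\right)S_1 + \left(\alpha_2 + \frac{\alpha_2^2}{2!} + \dots\right)S_2\right\}^{l}\Big/{}^nC_{s_1, s_2}$$

$$= 1 + \frac{l}{n}\left\{\frac{s_1}{1!}\left(\alpha_1 + \frac{\alpha_1^2}{2!} + \dots\right) + \frac{s_2}{1!}\left(\alpha_2 + \frac{\alpha_2^2}{2!} + \dots\right)\right\}$$

$$+ \frac{l(l-1)}{n(n-1)}\left\{\frac{s_1(s_1-1)}{2!}\left(\alpha_1 + \frac{\alpha_1^2}{2!} + \dots\right)^{2} + \frac{s_1}{1!}\,\frac{s_2}{1!}\left(\alpha_1 + \frac{\alpha_1^2}{2!} + \dots\right)\left(\alpha_2 + \frac{\alpha_2^2}{2!} + \dots\right) + \frac{s_2(s_2-1)}{2!}\left(\alpha_2 + \frac{\alpha_2^2}{2!} + \dots\right)^{2}\right\} + \dots. \tag{64}$$

The power moments, $m_{r_1}$, $m_{r_2}$, about the origin, are the coefficients
of $\frac{\alpha_1^{r_1}}{r_1!}$, $\frac{\alpha_2^{r_2}}{r_2!}$ and so are given by the initial and final terms of each chain
bracket of (64) and are as given in (62) above. The product moments,
$m_{r_1 r_2}$, are the coefficients of the products $\frac{\alpha_1^{r_1}}{r_1!}\,\frac{\alpha_2^{r_2}}{r_2!}$ and so are given by
the intermediate terms and

$$m_{r_1 r_2} = \frac{l(l-1)}{n(n-1)}\,\frac{s_1}{1!}\,\frac{s_2}{1!}\;{}^{1}E_{r_1}\,{}^{1}E_{r_2} + \frac{l(l-1)(l-2)}{n(n-1)(n-2)}\left\{\frac{s_1(s_1-1)}{2!}\,\frac{s_2}{1!}\;{}^{2}E_{r_1}\,{}^{1}E_{r_2} + \frac{s_1}{1!}\,\frac{s_2(s_2-1)}{2!}\;{}^{1}E_{r_1}\,{}^{2}E_{r_2}\right\} + \dots, \tag{65}$$

the general term being $\frac{(l-)^{i+j}}{(n-)^{i+j}}\,\frac{(s_1-)^{i}}{i!}\,\frac{(s_2-)^{j}}{j!}\;{}^{i}E_{r_1}\,{}^{j}E_{r_2}$ and
the series terminating usually in virtue of ${}^{i}E_r$ vanishing when $i > r$.
Here $(n-)^{i} = n(n-1) \dots$ to i factors.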
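The product moments of the double hypergeometric can be checked in the same manner. The sketch below sums the general term of the series over i and j and compares it with a direct summation over the joint frequencies; the function names are assumptions, and $s_1 + s_2 \leq n$ is supposed.

```python
from fractions import Fraction
from math import comb, factorial

def falling(x, k):
    out = 1
    for i in range(k):
        out *= x - i
    return out

def e_number(i, r):
    """iE_r: r! times the coefficient of alpha^r in (e^alpha - 1)^i."""
    return sum((-1) ** (i - j) * comb(i, j) * j ** r for j in range(i + 1))

def product_moment_by_array(n, l, s1, s2, r1, r2):
    """m_{r1 r2} from the intermediate terms of the moment array."""
    total = Fraction(0)
    for i in range(1, min(r1, s1) + 1):
        for j in range(1, min(r2, s2) + 1):
            if i + j > l:
                continue                      # (l-)^(i+j) vanishes
            total += Fraction(
                falling(l, i + j) * falling(s1, i) * falling(s2, j)
                * e_number(i, r1) * e_number(j, r2),
                falling(n, i + j) * factorial(i) * factorial(j))
    return total

def product_moment_direct(n, l, s1, s2, r1, r2):
    """m_{r1 r2} by summing over the numbers a1, a2 of A in the two samples."""
    ways_all = comb(n, s1) * comb(n - s1, s2)
    total = Fraction(0)
    for a1 in range(0, min(l, s1) + 1):
        for a2 in range(0, min(l - a1, s2) + 1):
            b1, b2 = s1 - a1, s2 - a2         # numbers of not-A in each sample
            if b1 + b2 > n - l:
                continue
            ways = (comb(l, a1) * comb(l - a1, a2)
                    * comb(n - l, b1) * comb(n - l - b1, b2))
            total += Fraction(ways, ways_all) * a1 ** r1 * a2 ** r2
    return total
```

For n = 9, l = 4, $s_1 = 3$, $s_2 = 2$ the first product moment is $l(l-1)s_1s_2/n(n-1) = 1$ by either route.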

* The moments of the double hypergeometrical series of frequencies are investigated
by Isserlis in his paper 'The Application of the Solid Hypergeometrical Series
to Frequency Distributions in Space,' Phil. Mag. vol. xxviii. (1914), p. 379.

6. As a third example let a population of n individuals be sampled,
amongst whom two characters B, C are distributed
in such wise that a possess neither B nor C, b possess
B only, c C only and d both B and C. The samples
of s are therefore expressed as

$$\text{Array } S^s : (1+S)^{a}\,(1+BS)^{b}\,(1+CS)^{c}\,(1+BCS)^{d}\big/{}^nC_s$$

and the moments given by the development of

$$\text{Array } S^s : (1+S)^{a}\,(1+e^{\beta}S)^{b}\,(1+e^{\gamma}S)^{c}\,(1+e^{\beta+\gamma}S)^{d}\big/{}^nC_s$$

$$: (1+S)^{a}\left\{(1+S) + \left(\beta + \frac{\beta^2}{2!} + \dots\right)S\right\}^{b}\left\{(1+S) + \left(\gamma + \frac{\gamma^2}{2!} + \dots\right)S\right\}^{c}$$

$$\times\left\{(1+S) + \left(\beta + \gamma + \frac{(\beta+\gamma)^2}{2!} + \dots\right)S\right\}^{d}\Big/{}^nC_s. \tag{66}$$

Thus, referred to natural origin, the numbers of B and C in the
samples of s, drawn without replacement, have for their moments

mean (B) = (b + d) . s/n,

mean (C) = (c + d) . s/n,

2nd moment (B) = (b + d) . s/n + (b + d)(b + d - 1) . s(s - 1)/n(n - 1),

2nd moment (C) = (c + d) . s/n + (c + d)(c + d - 1) . s(s - 1)/n(n - 1),

product moment = d . s/n + {(b + d)(c + d) - d} . s(s - 1)/n(n - 1),

(67)

and higher orders, for a time, not difficult to extract.
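The moments (67) are easily verified on a small population by enumerating every possible sample. In the sketch below the function names and dictionary keys are illustrative assumptions; each individual is represented by a pair recording whether it possesses B and C.

```python
from fractions import Fraction
from itertools import combinations

def moments_by_formula(a, b, c, d, s):
    """The five moments of (67) for samples of s without replacement."""
    n = a + b + c + d
    p1 = Fraction(s, n)
    p2 = Fraction(s * (s - 1), n * (n - 1))
    return {
        "mean_B": (b + d) * p1,
        "mean_C": (c + d) * p1,
        "second_B": (b + d) * p1 + (b + d) * (b + d - 1) * p2,
        "second_C": (c + d) * p1 + (c + d) * (c + d - 1) * p2,
        "product": d * p1 + ((b + d) * (c + d) - d) * p2,
    }

def moments_by_enumeration(a, b, c, d, s):
    """The same moments from all equally likely samples of s individuals."""
    population = [(0, 0)] * a + [(1, 0)] * b + [(0, 1)] * c + [(1, 1)] * d
    sums = {"mean_B": 0, "mean_C": 0, "second_B": 0, "second_C": 0, "product": 0}
    count = 0
    for sample in combinations(population, s):
        nb = sum(ind[0] for ind in sample)    # number of B in the sample
        nc = sum(ind[1] for ind in sample)    # number of C in the sample
        sums["mean_B"] += nb
        sums["mean_C"] += nc
        sums["second_B"] += nb * nb
        sums["second_C"] += nc * nc
        sums["product"] += nb * nc
        count += 1
    return {key: Fraction(value, count) for key, value in sums.items()}
```

The enumeration grows rapidly with n and is meant only as a check of the formulae on small cases.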

7. In the three cases taken above either one or two 'alternative'
characters are discerned in the individuals composing the limited population
from which samples are taken. As a last illustration we will
suppose measured characters, in any number, distributed with normal*
or other frequencies, in the n members and obtain both as a partial and
separate moment array the distribution of totals of the several characters,
when random samples of s are withdrawn and replaced en bloc, or when
the whole population is randomly partitioned into groups of given
numbers.

If A, B ... are the character symbols, an individual of the population
possessing measures x, y ... in these characters has the symbol

$$I,\ \text{or}\ A^xB^y \dots,\ \text{or}\ e^{x\alpha + y\beta + \dots},\ \text{or}\ e^{t},\ \text{say},$$

where t, standing for $x\alpha + y\beta + \dots$, shows his features expressed as a
sum array.

* No limited population of number n can truthfully be said to be distributed
normally. Actually we assume the population to have all the Gaussian moments
and surmise that, if n moments so agree, the result will not be much in error, for
our purpose, by the disagreement of the remainder.

The array of the population is, therefore,

$$\sum A^xB^y \dots,\ \text{or}\ \sum e^{t},\ \text{or say}\ n\phi(\alpha, \beta \dots). \tag{68 i}$$

Now in accordance with the methods of this chapter, if R stands
for 'rejected' and S for 'taken in sample' the totals of the characters
in samples of s are given as

$$\text{Array } R^{n-s}S^s : \prod(R + IS)\big/{}^nC_s,$$

$\prod$ repeating for the different values of I, or $e^{x\alpha + y\beta + \dots}$, n in number.

We are given the moments of x, y ..., viz. the coefficients of the
moment array, $\phi(\alpha, \beta \dots)$, of the universal, and must therefore rearrange
the above expression, as regards the variates x, y ..., to show only $\phi$
coefficients. To do this, virtually we take logs and then expand each
$\log(R + e^{t}S)$ in powers of t by Taylor's theorem. Actually we abbreviate
by using the symbolic expression of the theorem. Putting $S = e^{\sigma}$ we
have a function of $t + \sigma$ and write symbolically

$$f(t + \sigma) = e^{\,t\frac{d}{d\sigma}} f(\sigma).$$
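Proceeding thus, the totals have expressions

$$\text{Array } \frac{R^{n-s}S^s}{(n-s)!\,s!} : \frac{1}{n!}\,e^{\sum\log(R + e^{t+\sigma})}$$

$$: \frac{1}{n!}\,e^{\sum e^{\,t\frac{d}{d\sigma}}\log(R + e^{\sigma})}$$

$$: \frac{1}{n!}\,e^{\,n\phi\left(\alpha\frac{d}{d\sigma},\ \beta\frac{d}{d\sigma} \dots\right)\log(R + e^{\sigma})} \qquad [\text{by (68 i)}]$$

$$: \frac{1}{n!}\,e^{\,n\phi\left(\alpha S\frac{d}{dS},\ \beta S\frac{d}{dS} \dots\right)\log(R + S)}. \tag{68 ii}$$

Hence (68 ii) shows the distribution of totals, for single sampling,
as a partial array. To extract the array, put $\phi = 1 + \phi_1$, so that $\phi_1(\alpha, \beta \dots)$
stands for $\bar{x}\alpha + \bar{y}\beta + \dots + (\overline{x^2}\alpha^2 + 2\,\overline{xy}\,\alpha\beta + \overline{y^2}\beta^2 + \dots)/2! + \dots$ in accordance
with (24) or (68 i). Then the sample totals are

$$\text{Array } \frac{R^{n-s}S^s}{(n-s)!\,s!} : \frac{(R+S)^n}{n!}\,e^{\,n\phi_1\left(\alpha S\frac{d}{dS},\ \beta S\frac{d}{dS} \dots\right)\log(R + S)}. \tag{68 iii}$$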

Now if the $\phi_1$ functional operator is expanded in powers of $\alpha, \beta \dots$
and we get a coefficient $\left(S\frac{d}{dS}\right)^{r}\log(R + S)$, this is a series of powers of
$\frac{S}{R+S}$, commencing with the first power. If we then expand the exponential,
we get a series of powers of $\frac{S}{R+S}$, preceded by unity. But

$$\text{Array } \frac{R^{n-s}S^s}{(n-s)!\,s!} : \frac{(R+S)^n}{n!}\left(\frac{S}{R+S}\right)^{j} = \frac{(s-)^{j}}{(n-)^{j}}, \tag{68 iv}$$

where $(s-)^{j}$ stands for $s(s-1) \dots$ j factors. Thus the partial array is
given simply by putting $\frac{s-}{n-}$ for $\frac{S}{R+S}$ in the final expansion of the
exponential.

The result can be expressed symbolically. For let $T = \frac{S}{R+S}$.

$$\therefore\ \left(S\frac{d}{dS}\right)^{r}\log(R + S) = \left\{T(1-T)\frac{d}{dT}\right\}^{r}\{-\log(1-T)\}, \qquad (r > 0)$$

and, making the substitution, the expression (68 iii) emerges, as explained,
as a function of T which, in the final power series, must be
replaced by $\frac{s-}{n-}$, the powers of this having the meanings given.

Hence, if from a population of n having any measured characters
whose array is

$$\phi(\alpha, \beta \dots)$$

are drawn whole samples of s, the totals of the characters in a sample,
on repetition of the experiment, have array

$$e^{\,n\left\{\phi\left[\alpha T(1-T)\frac{d}{dT},\ \beta T(1-T)\frac{d}{dT} \dots\right] - 1\right\}\cdot\{-\log(1-T)\}}$$

when this is expanded in powers of T and, in the end, $T^{j}$ is replaced by

$$s(s-1) \dots j\ \text{factors}\ \big/\ n(n-1) \dots j\ \text{factors}. \tag{68 v}$$

To assist in the calculation of such expressions as (68 ii), (68 v), we
find that, if

$$\left(S\frac{d}{dS}\right)^{r}\log(R + S) = k_1\,\frac{S}{R+S} + k_2\left(\frac{S}{R+S}\right)^{2} + \dots,$$

then the coefficients $k_1, k_2 \dots$ run as follows for r = 1; 2; 3; ... viz.

1; 1 - 1; 1 - 3 + 2; 1 - 7 + 12 - 6;
1 - 15 + 50 - 60 + 24; 1 - 31 + 180 - 390 + 360 - 120; ... (68 vi)
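The rows of (68 vi) may be regenerated by applying the operator $T(1-T)\,d/dT$ repeatedly, beginning from T, since that operator applied to $-\log(1-T)$ gives T. A minimal sketch, with illustrative function names:

```python
def next_row(k):
    """Apply T(1-T) d/dT to k[0] T + k[1] T^2 + ... and return the new row."""
    out = [0] * (len(k) + 1)
    for index, coeff in enumerate(k):
        power = index + 1                   # the term is coeff * T^power
        out[index] += power * coeff         # from  T   . d/dT
        out[index + 1] -= power * coeff     # from -T^2 . d/dT
    return out

def k_table(max_r):
    """Rows k_1, k_2, ... for r = 1 .. max_r."""
    rows = [[1]]
    for _ in range(max_r - 1):
        rows.append(next_row(rows[-1]))
    return rows
```

`k_table(6)` reproduces the six rows printed in (68 vi); each row after the first sums to zero, as it must since the operator carries the factor $1 - T$.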

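In the special case of the population having a normal moment array,
with zero means, by (39),

$$\phi(\alpha, \beta \dots) = e^{\frac12\sigma_1^2\alpha^2 + \frac12\sigma_2^2\beta^2 + \dots + r_{12}\sigma_1\sigma_2\alpha\beta + \dots}$$

and the moment array of total characters in samples of s is therefore

$$e^{\,n\left\{e^{\left(\frac12\sigma_1^2\alpha^2 + \frac12\sigma_2^2\beta^2 + \dots + r_{12}\sigma_1\sigma_2\alpha\beta + \dots\right)\left(T(1-T)\frac{d}{dT}\right)^{2}} - 1\right\}\{-\log(1-T)\}}. \tag{68 vii}$$

As an example of the last case, if there is one character only,
normally distributed, with s.d. $\sigma$, in the population of n individuals,
and we require the moments of the total measures in samples of s,
these have moment array, by (68 vii),

$$e^{\,n\left\{e^{\frac12\sigma^2\alpha^2\left(T(1-T)\frac{d}{dT}\right)^{2}} - 1\right\}\{-\log(1-T)\}}$$

or, $e^{\,n\{(T - T^2)\frac12\sigma^2\alpha^2 + (T - 7T^2 + 12T^3 - 6T^4)\frac18\sigma^4\alpha^4 + \dots\}}$ by (68 vi),

or, $1 + n(T - T^2)\tfrac12\sigma^2\alpha^2 + \{n(T - 7T^2 + 12T^3 - 6T^4) + n^2(T^2 - 2T^3 + T^4)\}\tfrac18\sigma^4\alpha^4 + \dots$;

whence, replacing $T^{j}$ by $(s-)^{j}/(n-)^{j}$,

$$\frac{\text{2nd mt.}}{\sigma^2} = n\left\{\frac{s}{n} - \frac{s(s-1)}{n(n-1)}\right\},$$

$$\frac{\text{4th mt.}}{3\sigma^4} = n\left\{\frac{s}{n} - 7\,\frac{(s-)^2}{(n-)^2} + 12\,\frac{(s-)^3}{(n-)^3} - 6\,\frac{(s-)^4}{(n-)^4}\right\} + n^2\left\{\frac{(s-)^2}{(n-)^2} - 2\,\frac{(s-)^3}{(n-)^3} + \frac{(s-)^4}{(n-)^4}\right\}.{}^{*} \tag{68 viii}$$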
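The second and fourth moments of the sample totals can be tested by enumeration on a small population. The sketch below is an illustration under stated assumptions: the function names are not from the text, the population must have zero mean and at least four members, and its own second and fourth moments $m_2$, $m_4$ are used in place of $\sigma^2$ and $3\sigma^4$, which is what the same expansion gives when the population moments are not those of the normal curve.

```python
from fractions import Fraction
from itertools import combinations

def falling_ratio(s, n, j):
    """(s-)^j / (n-)^j, the value to be written for T^j."""
    num = den = 1
    for i in range(j):
        num *= s - i
        den *= n - i
    return Fraction(num, den)

def total_moments_by_array(population, s):
    """2nd and 4th moments of sample totals from the expanded array.

    Assumes integer measures with zero mean and len(population) >= 4.
    """
    n = len(population)
    m2 = Fraction(sum(x ** 2 for x in population), n)
    m4 = Fraction(sum(x ** 4 for x in population), n)
    t = [falling_ratio(s, n, j) for j in range(5)]
    second = n * m2 * (t[1] - t[2])
    fourth = (n * m4 * (t[1] - 7 * t[2] + 12 * t[3] - 6 * t[4])
              + 3 * n * n * m2 ** 2 * (t[2] - 2 * t[3] + t[4]))
    return second, fourth

def total_moments_by_enumeration(population, s):
    """The same moments from every possible sample of s, equally likely."""
    totals = [sum(sample) for sample in combinations(population, s)]
    count = len(totals)
    return (Fraction(sum(v ** 2 for v in totals), count),
            Fraction(sum(v ** 4 for v in totals), count))
```

For the population -3, -1, 0, 1, 3 and samples of 2 both routes give a second moment of 6 and a fourth moment of 354/5.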


If, lastly, instead of taking a sample, we randomly partition the
population of n into groups of given numbers $s_1, s_2 \dots s_g \dots$, by (60), in
which now $l = m = \dots = 1$, the totals in the several groups (distinguished
by suffixes 1, 2 ... g ...) are

$$\text{Array } \frac{S_1^{s_1}S_2^{s_2} \dots}{s_1!\,s_2! \dots} : \frac{1}{n!}\prod(I_1S_1 + I_2S_2 + \dots)$$

$$: \frac{1}{n!}\,e^{\sum\log(e^{t_1+\sigma_1} + e^{t_2+\sigma_2} + \dots)},$$

the $\prod$ and $\sum$, as before, repeating, for every individual, the symbol
meanings $I_g = A_g^{x}B_g^{y} \dots$, $t_g = x\alpha_g + y\beta_g + \dots$.

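* The value of $B_2$, or 4th mt./(2nd mt.)², deduced, accords with the value
obtained by Isserlis in his paper 'On the Value of a Mean as calculated from a
Sample,' Jo. Roy. Statistical Soc. vol. 81 (1918), p. 78.

The reduction of such expressions as (68 viii) is facilitated by observing that the
expansion symbolised by $\left(1 - \frac{s-}{n-}\right)^{j}$ must, by (68 iv), be

$$\frac{(n-s)(n-s-1) \dots j\ \text{factors}}{n(n-1) \dots j\ \text{factors}}.$$

For instance,

$$1 - \frac{2s}{n} + \frac{s(s-1)}{n(n-1)} = \frac{(n-s)(n-s-1)}{n(n-1)}.$$

Proceeding as before, we expand in powers of $t_1, t_2 \dots$ by Taylor's
theorem in its symbolic statement and get

$$: \frac{1}{n!}\,e^{\sum e^{\,t_1\frac{d}{d\sigma_1} + t_2\frac{d}{d\sigma_2} + \dots}\log(e^{\sigma_1} + e^{\sigma_2} + \dots)}$$

$$: \frac{1}{n!}\,e^{\,n\phi\left(\alpha_1\frac{d}{d\sigma_1} + \alpha_2\frac{d}{d\sigma_2} + \dots,\ \beta_1\frac{d}{d\sigma_1} + \beta_2\frac{d}{d\sigma_2} + \dots,\ \dots\right)\log(e^{\sigma_1} + e^{\sigma_2} + \dots)}$$

[as is seen from the identity (68 i), viz. $\sum e^{x\alpha' + y\beta' + \dots} = n\phi(\alpha', \beta' \dots)$, by
putting $\alpha_1\frac{d}{d\sigma_1} + \alpha_2\frac{d}{d\sigma_2} + \dots$ for $\alpha'$, $\beta_1\frac{d}{d\sigma_1} + \beta_2\frac{d}{d\sigma_2} + \dots$ for $\beta'$ etc.]

$$: \frac{(S_1 + S_2 + \dots)^n}{n!}\,e^{\,n\{\phi(\text{ditto}) - 1\}\log(S_1 + S_2 + \dots)}.$$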

The function $\phi - 1$ contains first and higher powers, only, of the
operators and the exponent of e contains therefore only powers and
products of $\frac{S_1}{S_1 + S_2 + \dots}$, $\frac{S_2}{S_1 + S_2 + \dots} \dots$ and the exponential expansion
contains such terms preceded by unity.

But, by the multinomial theorem,

$$\text{Array } \frac{S_1^{s_1}S_2^{s_2} \dots}{s_1!\,s_2! \dots} : \frac{(S_1 + S_2 + \dots)^n}{n!}\left(\frac{S_1}{S_1 + S_2 + \dots}\right)^{i}\left(\frac{S_2}{S_1 + S_2 + \dots}\right)^{j} \dots = \frac{(s_1-)^{i}\,(s_2-)^{j} \dots}{(n-)^{i+j+\dots}}.$$

Hence the required partial array, that is, the moment array of total
characters for the specified partition $s_1, s_2 \dots$ is given by writing $\frac{s_1-}{n-}$ for
$\frac{S_1}{S_1 + S_2 + \dots}$ etc. in this exponential expansion.

Hence, if a population of n, having any measured characters distributed
as the array

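is randomly partitioned into groups (suffixes 1, 2 ...) of given numbers
$s_1, s_2 \ldots$, the totals of the characters to be found in the groups, in repeated
partitionings, have array obtained by expanding

$$e^{\,n\left\{\phi\left(\alpha_1 \frac{S_1 d}{dS_1} + \alpha_2 \frac{S_2 d}{dS_2} + \ldots,\ \beta_1 \frac{S_1 d}{dS_1} + \beta_2 \frac{S_2 d}{dS_2} + \ldots,\ \ldots\right) - 1\right\} \log(S_1 + S_2 + \ldots)}$$

in powers and products of

$$\frac{S_1}{S_1 + S_2 + \ldots},\quad \frac{S_2}{S_1 + S_2 + \ldots},\ \ldots$$

and putting for these

$$\frac{s_1}{n},\quad \frac{s_2}{n},\ \ldots$$

and putting, finally,

$$(s_1)^{i} = s_1(s_1 - 1) \ldots i \text{ factors, etc.} \tag{68 ix}$$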

In accordance with (26) the result for single sampling (our first case)
is obtained by writing $\alpha_2, \alpha_3 \ldots \beta_2 \ldots$ etc. zero, leaving an array in
$\alpha_1, \beta_1 \ldots$, and that for double sampling in any given numbers is
obtained by writing all the symbols zero but those with suffix 1 or 2.

If  the  moments  of  means  are  needed,  in  place  of  those  of  totals,  in 
accordance  with  (25)  we  write 

$\alpha_1/s_1,\ \beta_1/s_1 \ldots \alpha_2/s_2 \ldots$ etc.
for $\alpha_1,\ \beta_1 \ldots \alpha_2 \ldots$ etc.

If  it  is  required  to  find  the  various  power  and  product  moments,  in 
sampling,  of  the  higher  moments,  vice  the  means  of  the  simple  measures, 
of  characters,  the  formulae  will  give  these,  since  powers  of  measures  may 
be  taken  as  the  characters  considered.  Arrays  are  very  simply  referred 
to  means,  viz.  by  suppressing  the  linear  terms  in  their  exponential 
expression  [see  (39),  (40)  e.g.]. 

Looking  at  the  kind  of  result  recorded  in  the  problems  dealt  with  in 
this  chapter  the  difficulties  are  realized  that  are  likely  to  be  encountered 
in  attempting  to  obtain  the  terms  contributing  to  the  higher  moments 
of  this  class  of  distributions  in  any  piecemeal  way  instead  of  as  a 
whole. 


V.  GEOMETRICAL  DISTRIBUTIONS.  SAMPLES  OF  VECTORS. 
RANDOM  MIGRATION 

1. The symbols A, B, C ..., which, hitherto, have symbolised unit
characters generally, may as a special case stand for unit steps or
vectors in given directions. X, Y, Z may conveniently signify unit
vectors parallel to the coordinate axes and, then, according to present
methods, $X^{x}$ will present a step, parallel to the axis of x, of x units
and $X^{x}Y^{y}Z^{z}$ will present the point whose coordinates are x, y, z
relative to the origin.

The frequency array

$$U = \Sigma\, p_{x,y,z}\, X^{x}Y^{y}Z^{z}$$

will present an assemblage of points, steps, or vectors, $p_{x,y,z}$ being the
probability or frequency of representative points at x, y, z and

$$U^{n}$$

will array the total steps or resultant vectors when samples of n are
drawn from U and added.

As before, the Greek letters $\xi, \eta, \zeta$ will stand for log X, log Y, log Z
in the algebraical transformations which counterfeit the selective
processes.

2.  If  lines  and  geometrical  extensions  may  be  looked  upon  as 
aggregates  of  closely  and  equally  spaced  points,  they  may  be  shown  as 
frequency  arrays  by  single  expressions. 

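Thus the line between $x = -1$ and $x = 1$ is expressed as

$$\sum_{-1}^{1} X^{x}\,\frac{dx}{2}, \quad\text{or}\quad \int_{-1}^{1} e^{x\xi}\,\frac{dx}{2}, \quad\text{or}\quad \frac{\sinh \xi}{\xi}. \tag{69}$$

From this we infer that the means of randomly chosen sets of n points
in the line have moment array

$$\left(\frac{\sinh(\xi/n)}{\xi/n}\right)^{n}, \tag{69'}$$

whence by taking the coefficients of

$$\frac{\xi^{2}}{2!},\quad \frac{\xi^{4}}{4!},\quad \frac{\xi^{6}}{6!}$$

in the expansion, the 2nd, 4th and 6th moments are

$$\frac{1}{3n},\quad \frac{\tfrac15 + \tfrac13 (n-1)}{n^{3}},\quad \frac{\tfrac17 + (n-1) + \tfrac59 (n-1)(n-2)}{n^{5}}. \tag{70}$$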
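The moments (70) can be checked by carrying out the expansion of (69') mechanically. The following is a minimal sketch in exact rational arithmetic; the function names are illustrative only and do not come from the text.

```python
from fractions import Fraction
from math import factorial


def line_array_series(order):
    """Coefficients of xi**k in sinh(xi)/xi, the array (69) of the line -1..1."""
    return [Fraction(1, factorial(k + 1)) if k % 2 == 0 else Fraction(0)
            for k in range(order + 1)]


def series_power(coeffs, n, order):
    """Truncated power series coeffs**n, keeping terms up to xi**order."""
    result = [Fraction(1)] + [Fraction(0)] * order
    for _ in range(n):
        product = [Fraction(0)] * (order + 1)
        for i, a in enumerate(result):
            for j, b in enumerate(coeffs[: order + 1 - i]):
                product[i + j] += a * b
        result = product
    return result


def moments_of_mean(n, order=6):
    """Even moments of the mean of n points drawn at random from the line."""
    total = series_power(line_array_series(order), n, order)
    # The k-th moment of the total is k! times the coefficient of xi**k;
    # passing from totals to means divides it by n**k.
    return {k: total[k] * factorial(k) / Fraction(n) ** k
            for k in range(2, order + 1, 2)}


if __name__ == "__main__":
    print(moments_of_mean(3))
```

For n = 1 the routine returns the moments 1/3, 1/5, 1/7 of the line itself, as it should.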

3. Another use to which line arrays may be put is in the quadrature
or smoothing of arrays. For, if $\phi(\xi)$ is the array representation of any
frequency curve,

$$\phi(\xi) \times \frac{\sinh a\xi}{a\xi}, \tag{71}$$

by (17), will array the result of taking an ordinate of the curve with a
point on the line.

(71) therefore arrays the means of frequencies extending to the
distance a on either side of the ordinate.
4. If $A = X^{x}Y^{y}Z^{z}$, $B = X^{x'}Y^{y'}Z^{z'}$ are vectors, $AB$ is the resultant
and, if $\alpha, \beta$ are defined by $A = e^{\alpha}$, $B = e^{\beta}$, we may say that the resultant
of $\alpha$ and $\beta$ is $\alpha + \beta$.

Thus the moment symbols have the characteristics of the vector
symbols of the usual theory. Also

$$\alpha = x\xi + y\eta + z\zeta, \qquad \beta = x'\xi + y'\eta + z'\zeta,$$

$\xi, \eta, \zeta$ being the moment symbols of unit vectors along the axes,
defined by $X = e^{\xi}$, $Y = e^{\eta}$, $Z = e^{\zeta}$.

5. The limited line joining two points A, B is expressed in array
form as

$$\int_0^1 A^{1-t}B^{t}\,dt, \quad\text{or}\quad \frac{e^{\beta} - e^{\alpha}}{\beta - \alpha}.$$


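By a similarly constructed integral it may be shown that the space
of the triangle A, B, C has as its expression

$$\frac{2e^{\alpha}}{(\alpha-\beta)(\alpha-\gamma)} + \frac{2e^{\beta}}{(\beta-\gamma)(\beta-\alpha)} + \frac{2e^{\gamma}}{(\gamma-\alpha)(\gamma-\beta)}.$$

Since the three middle points of the sides have array

$$\tfrac13\left\{e^{\frac12(\alpha+\beta)} + e^{\frac12(\beta+\gamma)} + e^{\frac12(\gamma+\alpha)}\right\}$$

and both expressions have expansion beginning

$$1 + \tfrac13(\alpha + \beta + \gamma) + \tfrac1{12}(\alpha^2 + \beta^2 + \gamma^2 + \alpha\beta + \alpha\gamma + \beta\gamma) + \ldots,$$

it follows that the first and second order moments are the same for the
triangular area and for the three mid-points of the sides.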
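The agreement of the two arrays to the second order can be illustrated numerically. The sketch below treats the symbols of the three vertices as small real numbers (an assumption made purely for the check; in the text they are moment symbols) and compares both arrays with the common expansion quoted above. The function names are illustrative.

```python
import math


def triangle_array(a, b, c):
    """Array of the triangular area, with a, b, c the vertex symbols."""
    return (2 * math.exp(a) / ((a - b) * (a - c))
            + 2 * math.exp(b) / ((b - c) * (b - a))
            + 2 * math.exp(c) / ((c - a) * (c - b)))


def midpoint_array(a, b, c):
    """Array of the three mid-points of the sides, each with frequency 1/3."""
    return (math.exp((a + b) / 2) + math.exp((b + c) / 2)
            + math.exp((c + a) / 2)) / 3


def second_order(a, b, c):
    """Common expansion of both arrays up to terms of the second order."""
    return (1 + (a + b + c) / 3
            + (a * a + b * b + c * c + a * b + a * c + b * c) / 12)


if __name__ == "__main__":
    a, b, c = 0.01, 0.02, -0.015
    print(triangle_array(a, b, c), midpoint_array(a, b, c), second_order(a, b, c))
```

The three printed values differ only by quantities of the third order in the symbols.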

It will be clear that the points in the lines and geometrical extensions
may, on occasion, be regarded also as being unequally spaced or as having
distributions of density.

Thus if $\gamma \to \beta$ in the above triangular area we have, in the limit,
a line of points whose density increases uniformly from zero at the end $\alpha$.
Such a limited line, then, will have as its array

$$\frac{2\{e^{\alpha} - e^{\beta} - (\alpha - \beta)e^{\beta}\}}{(\alpha - \beta)^{2}}.$$

It may here be recalled to mind that all our arrays represent distributions
of frequency, the total frequency being unity, or occasionally
zero. There is therefore no question of the absolute number of points.
In the actual universal it is generally infinite: in the frequency array
it is generally one in number.

6. Polar symbols in two dimensional distributions. When distributions
of points in a plane are given in polar coordinates $r, \theta$, their arrays
may be expressed in terms of two symbols $\tau, \upsilon$, defined as connected
with the symbols $\xi, \eta$ by the relations

$$\xi = \tau\cos\upsilon, \qquad \eta = \tau\sin\upsilon, \qquad \xi^2 + \eta^2 = \tau^2.$$

The symbol for the point x, y is, therefore,

$$X^{x}Y^{y}, \quad\text{or}\quad e^{x\xi + y\eta}, \quad\text{or}\quad e^{r\tau\cos(\upsilon - \theta)},$$

and an array of points will be expressed as

$$\iint F(r, \theta)\, e^{r\tau\cos(\upsilon - \theta)}\, dr\, d\theta, \tag{75}$$

taken between definite limits, $F(r, \theta)\, dr\, d\theta$ being the frequency of points
at $r, \theta$, whose integral between the same, given, limits is generally
unity.

7. Circular distributions. Vector sampling when all directions in
the plane are equally probable. In polar symbols, therefore, a circle of
radius r from the origin is expressed as

$$\int_0^{2\pi} e^{r\tau\cos(\upsilon - \theta)}\,\frac{d\theta}{2\pi}, \quad\text{or}\quad I_0(r\tau)^{*},$$

and a distribution of points in circles about the origin as

$$\phi(\tau) = \int_0^{\infty} f(r)\, I_0(r\tau)\, dr. \tag{76}$$


In  a  great  many  problems  presented  in  physics  we  require  the 
fortuities  resulting  from  taking  samples  of  vectors,  in  known  numbers, 
and  compounding  them.  The  tensors  are  sometimes  given,  but  usually 
they  are  fortuitous,  with  given  frequencies,  and  the  directions,  in  these 
problems,  are  supposed  purely  random,  that  is  to  say,  all  equally 
probable. 

The  universals  are  therefore  equivalent  to  distributions  of  points  in 

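* We may define the Bessel function with imaginary argument and obtain its
expansion thus:

$$I_n(x)\ [= J_n(ix)/i^{n}] = \int_0^{2\pi} \cos n\psi \cdot e^{x\cos\psi}\,\frac{d\psi}{2\pi}$$

$$= \int_0^{2\pi} \cos n\psi \left( \ldots + \frac{\cos^{s}\psi \cdot x^{s}}{s!} + \ldots \right) \frac{d\psi}{2\pi}$$

$$= \int_0^{2\pi} \cos n\psi \left( \ldots + \frac{s!\; 2\cos n\psi}{p!\,(s-p)!} \cdot \frac{x^{s}}{2^{s}\, s!} + \ldots \right) \frac{d\psi}{2\pi} \qquad (s - 2p = n)$$

and we may deduce in like manner that $\int_0^{2\pi} \sin n\psi \cdot e^{x\cos\psi}\,\frac{d\psi}{2\pi} = 0$.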
circles, expressed as (76), and the methods of arraying samples drawn
from universals, described in II, 2, enable us to write down immediately
the distribution of the resultant vector as an array, $\psi(\tau)$, in the tensor
symbol $\tau$.

This represents another distribution of points in circles and the
frequencies of the tensor r are therefore, by (76), to be found by the
solution for $h(r)$ of the integral equation

$$\psi(\tau) = \int_0^{\infty} h(r)\, I_0(r\tau)\, dr.$$

The solution, by Hankel's formula, is

$$h(r) = \int_0^{\infty} rx\, J_0(rx)\, \psi(ix)\, dx. \tag{77}$$

As an example, if the conditions of the problem state that one step
of length $l_1$ is to be taken from a starting point, in random direction,
followed by a second of length $l_2$, also random, and so on, to n steps,
this is the same as taking one sample from the circle $I_0(l_1\tau)$, one from
$I_0(l_2\tau)$, and so on, and therefore, by (17),

$$\psi(\tau) = I_0(l_1\tau)\, I_0(l_2\tau) \ldots I_0(l_n\tau)$$

will array the resultant step, viz. that obtained by compounding the
sums of the characters X, Y to be found in the samples.

We therefore, in this case, reach Kluyver's* formula for frequency
of final excursion

$$h(r) = \int_0^{\infty} rx\, J_0(rx)\, J_0(l_1 x) \ldots J_0(l_n x)\, dx. \tag{78}$$

If, on the other hand, as is more usual in molecular and other
physical problems, tensors are fortuitous, with given frequencies $F(r)\,dr$,
we are now taking n samples from

$$\int_0^{\infty} F(r)\, I_0(r\tau)\, dr,$$

and these, by (16), will have their resultants arrayed as this expression
raised to the power n.

Hence, in this instance, the frequency of resultant tensor length is

$$h(r) = \int_0^{\infty} rx\, J_0(rx) \left\{ \int_0^{\infty} F(r)\, J_0(rx)\, dr \right\}^{n} dx. \tag{79}$$

* For references to Kluyver's paper (1905) see Pearson and Blakeman, 'A Mathematical
Theory of Random Migration,' Drapers' Company Research Memoirs, Biometric
Series III (1906), and Rayleigh, 'On the Problem of Random Vibrations and
of Random Flights in One, Two, or Three Dimensions,' Phil. Mag. Vol. xxxvii
(1919), p. 329.
8. Moments of tensors of circular distributions. It was shown in
II, 3, that the array of a simple character, A, when expanded in powers
of the symbol $\alpha$, gave the sth moment of the frequencies as the coefficient
of $\alpha^{s}/s!$, or, that

$$\phi(\alpha) = 1 + m_1\alpha + m_2\frac{\alpha^2}{2!} + \ldots$$

It appears from (76), by expanding $I_0(r\tau)$, that an array of points
in concentric circles, when expanded in powers of the tensor symbol $\tau$,
gives the 2sth moment of the tensor as the coefficient of $\tau^{2s}/2^{2s}\, s!\, s!$,
or, that

$$\phi(\tau) = 1 + m_2\frac{\tau^2}{2^2\, 1!\, 1!} + m_4\frac{\tau^4}{2^4\, 2!\, 2!} + \ldots \tag{80}$$

Thus tensor arrays are moment arrays, although in a sense that
needs to be restated.

As an example let us find the moments of the resultant tensor in
terms of the moments of the universal of tensors from which samples of
n are taken and compounded at random angles.

Suppose, for instance, that 'free paths' are considered, liable to such
fortuities that the moments of the frequencies of length of path are
$m_2, m_4 \ldots$ and that forward and backward and all directions of path are
equally probable. Then (80) is the universal of 'free paths.'

If n such paths are taken, in sequence, randomly chosen as to length
and direction, the chances of final distance reached from the starting
point will have another distribution $\psi(\tau)$; and if $M_2, M_4 \ldots$ are the
moments of this distribution, by (80),

$$\psi(\tau) = 1 + M_2\frac{\tau^2}{2^2\, 1!\, 1!} + M_4\frac{\tau^4}{2^4\, 2!\, 2!} + \ldots$$

But, by (16), $\psi(\tau) = \{\phi(\tau)\}^{n}$.

Hence the moments of the final transference, after n paths have been
traversed, are related to the moments of the probability curve of length
of path by the equations

$$M_4 = nm_4 + 2n(n-1)m_2^{2},$$

$$M_6 = nm_6 + 9n(n-1)m_2 m_4 + 6n(n-1)(n-2)m_2^{3},$$

$$M_8 = nm_8 + 2n(n-1)(9m_4^{2} + 8m_2 m_6) + 72n(n-1)(n-2)m_2^{2}m_4 + 24n(n-1)(n-2)(n-3)m_2^{4}. \tag{81}$$
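The relations (81) follow from raising the series (80) to the power n and reading off coefficients. A minimal sketch of that computation in exact arithmetic is given here; the function name and the dictionary layout of the moments are assumptions made for illustration.

```python
from fractions import Fraction
from math import factorial


def resultant_moments_plane(m, n):
    """Even moments M_2, M_4, ... of the resultant of n random paths in a plane.

    m maps 2s to the 2s-th moment of path length, e.g. {2: m2, 4: m4}.
    """
    order = max(m) // 2
    # Write phi as a series in t = tau**2 / 4: coefficient m_{2s} / (s!)**2, as in (80).
    phi = [Fraction(1)] + [Fraction(m[2 * s]) / factorial(s) ** 2
                           for s in range(1, order + 1)]
    psi = [Fraction(1)] + [Fraction(0)] * order
    for _ in range(n):  # psi = phi**n, truncated at t**order
        nxt = [Fraction(0)] * (order + 1)
        for i in range(order + 1):
            for j in range(order + 1 - i):
                nxt[i + j] += psi[i] * phi[j]
        psi = nxt
    # Undo the normalisation to recover the moments of the resultant.
    return {2 * s: psi[s] * factorial(s) ** 2 for s in range(1, order + 1)}


if __name__ == "__main__":
    # Two unit steps in random directions.
    print(resultant_moments_plane({2: 1, 4: 1, 6: 1, 8: 1}, 2))
```

For two unit steps the squared resultant is 2 + 2 cos C with C uniformly distributed, and the routine reproduces the corresponding moments 2, 6, 20, 70.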

Thus, where the solution, (79), of the problem of vector sampling
presents difficulties, either by reason of $F(r)$ being observational, or, if
a known function, presenting difficulties in integration, the solution
(81), by moments, will sometimes enable approximate curves to be
fitted.

Before  dealing  with  asymmetric  distributions  of  vectors  and  the 
chances  of  resultant  tensor  and  direction  in  taking  random  samples 
therefrom,  a  few  propositions  are  here  given  to  illustrate  how  arrays 
may  give  new  interpretations  to  results  and  means  of  reaching  them. 

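9. The symmetrical Gaussian. If x, y vary independently, with
normal errors, the s.d.'s being unity, the array of points, by (35), is

$$e^{\frac12\xi^2} \times e^{\frac12\eta^2}, \quad\text{or}\quad e^{\frac12\tau^2}.$$

But the distribution is circular, the frequency of point being

$$\frac{1}{\sqrt{2\pi}}\, e^{-\frac12 x^2} \times \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 y^2} \times dx\, dy,$$

or

$$\frac{1}{2\pi}\, e^{-\frac12 r^2} \times r\, dr\, d\theta,$$

that is $r e^{-\frac12 r^2}\, dr$ in the zone dr.
Hence, by (76),

$$e^{\frac12\tau^2} = \int_0^{\infty} r e^{-\frac12 r^2}\, I_0(r\tau)\, dr, \tag{82}$$

which is a form of Weber's theorem.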
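Formula (82) lends itself to a direct numerical check. The following sketch evaluates the Bessel function from its power series and the integral by Simpson's rule, treating the tensor symbol as a real number; the truncation point, step count and function names are assumptions chosen for illustration, not part of the text.

```python
import math


def bessel_i0(z, terms=60):
    """I_0(z) from its power series, the sum of (z/2)**(2k) / (k!)**2."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (z / 2) ** 2 / (k + 1) ** 2
    return total


def weber_integral(tau, upper=12.0, steps=4000):
    """Simpson's rule for the right-hand side of (82), truncated at r = upper."""
    h = upper / steps

    def integrand(r):
        return r * math.exp(-r * r / 2) * bessel_i0(r * tau)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3


if __name__ == "__main__":
    for tau in (0.5, 1.0, 2.0):
        print(tau, weber_integral(tau), math.exp(tau * tau / 2))
```

The two printed columns should agree closely for moderate values of the symbol; larger values would need a longer range of integration and more series terms.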

10. Circular area. An even distribution of points within a circle of
radius a from the origin, by (76), has array

$$\int_0^{a} I_0(r\tau)\,\frac{2r\,dr}{a^2}, \quad\text{or}\quad \frac{2I_1(a\tau)}{a\tau}. \tag{83}$$

11. Uneven distribution of points on a circle. If the points

$$e^{a\tau\cos(\upsilon - \theta)}$$

on a circle of radius a are distributed with the harmonic density

$$\cos(s\theta + \gamma_s)\,\frac{d\theta}{2\pi}$$

the array is

$$\int_0^{2\pi} \cos(s\theta + \gamma_s)\, e^{a\tau\cos(\upsilon - \theta)}\,\frac{d\theta}{2\pi}, \quad\text{or}\quad I_s(a\tau)\cos(s\upsilon + \gamma_s). \tag{84}$$

Thus any uneven distribution of points in a circle can, by Fourier's
theorem, be shown as the sum of harmonic circular arrays such as (84).
12. Single points. In particular the point $(a, \theta)$, whose symbol is

$$e^{a\tau\cos(\upsilon - \theta)}$$

is, by the Fourier expansion, expressible as

$$I_0(a\tau) + 2\sum_{s=1}^{\infty} I_s(a\tau)\cos s(\upsilon - \theta). \tag{85}$$

In like manner the point $(b, \theta + C)$ is expressible as

$$I_0(b\tau) + 2\sum_{s=1}^{\infty} I_s(b\tau)\cos s(\upsilon - \theta - C). \tag{86}$$

The resultant of the two steps a and b, inclined at the angle C,
has tensor

$$c = \sqrt{a^2 + b^2 + 2ab\cos C}.$$

But the resultant of the two steps (85), (86), by V, 4, has as its
expression the product of (85), (86) symbolising the steps.

Let $\theta$ take all values from 0 to $2\pi$ with frequency $d\theta/2\pi$, a, b, C being constant.

Hence we conclude that the circle of radius c equals the integral of
the product of (85) and (86) into $d\theta/2\pi$;

$$\therefore\ I_0\{\sqrt{a^2 + b^2 + 2ab\cos C}\;\tau\} = I_0(a\tau)\, I_0(b\tau) + 2\sum_{s=1}^{\infty} I_s(a\tau)\, I_s(b\tau)\cos sC, \tag{87}$$

which is the same as Neumann's theorem.
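The identity (87) can be tried numerically by summing the series on the right and comparing with the single Bessel function on the left. The sketch below computes the functions from their power series with real arguments; the number of terms and harmonics retained, and the function names, are illustrative assumptions.

```python
import math


def bessel_i(order, z, terms=60):
    """I_s(z) from its series, the sum of (z/2)**(2k+s) / (k! (k+s)!)."""
    term = (z / 2) ** order / math.factorial(order)
    total = 0.0
    for k in range(terms):
        total += term
        term *= (z / 2) ** 2 / ((k + 1) * (k + 1 + order))
    return total


def neumann_sum(a, b, angle, tau, harmonics=40):
    """Right-hand side of (87), truncated after the given number of harmonics."""
    total = bessel_i(0, a * tau) * bessel_i(0, b * tau)
    for s in range(1, harmonics + 1):
        total += 2 * bessel_i(s, a * tau) * bessel_i(s, b * tau) * math.cos(s * angle)
    return total


def resultant_circle(a, b, angle, tau):
    """Left-hand side of (87): the circle whose radius is the resultant tensor."""
    c = math.sqrt(a * a + b * b + 2 * a * b * math.cos(angle))
    return bessel_i(0, c * tau)


if __name__ == "__main__":
    print(neumann_sum(1.5, 0.7, 1.1, 0.9), resultant_circle(1.5, 0.7, 1.1, 0.9))
```

With the angle zero the check reduces to the expansion of $I_0\{(a+b)\tau\}$ in products of the functions of $a\tau$ and $b\tau$.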

13.  Non-circular  distributions.  Vector  sampling  when  directions 
have  bias. 

If, firstly, the frequencies of r and $\theta$ are independent of one another,
the array of points in a plane, shown generally in (75), will take
the form

$$\iint f(r)\, g(\theta)\, e^{r\tau\cos(\upsilon - \theta)}\, dr\, d\theta/2\pi, \tag{88}$$

and, if $g(\theta)$ is expanded in a Fourier series,

$$g(\theta) = g_0 + \sum_s g_s\cos(s\theta + \gamma_s),$$

the array is, on integration with respect to $\theta$,

$$\int f(r)\,\{g_0 I_0(r\tau) + \sum_s g_s I_s(r\tau)\cos(s\upsilon + \gamma_s)\}\, dr. \tag{89}$$

Thus (89) is the array representation, in polar symbols, of any
distribution of points, in two dimensional space, for which $r, \theta$ are
independent variates.
If, secondly, we remove the restriction that $g_0, g_s, \gamma_s$ are independent
of r, (89) may represent any distribution of points, whatever, in a
plane.

In either case, on integration with respect to r between the definite
limits, we obtain an array of the form

$$\phi(\tau) + \sum_s \phi_s(\tau)\cos(s\upsilon + \epsilon_s), \tag{90}$$

in which $\tau$ is the tensor symbol, $\upsilon$ is the versor symbol and all else are
constants.

If from universals as (90) we take samples, in given numbers, by
(17) we may at once array the resultant points, or the points whose
coordinates are the sums of the coordinates of the sampled points.
Such arrays will take the form $U^{l}V^{m}W^{n} \ldots$ where U, V, W ... are the
universals from which l, m, n ... individuals are taken at random to form
the sample.

Thus  any  problems  in  sampling  of  vector  distributions  in  two 
dimensional  space  can  be  solved  so  far  as  to  express  the  result  as  an 
array,  in  the  form  (90). 

Clearly, since the indices such as l, m ... are integers, if the gonial
functions $g(\theta)$ are expressible as a limited number of Fourier terms, the
resultant array, in such a problem, will be expressed in a limited number
of terms as

$$\psi(\tau) + \sum_s \psi_s(\tau)\cos(s\upsilon + \eta_s),$$

with, now, different s values.

In order to arrive at frequencies, the integral equation to be solved
is, by (89),

$$\psi(\tau) + \sum_s \psi_s(\tau)\cos(s\upsilon + \eta_s) = \int_0^{\infty} h(r)\,\{k_0 I_0(r\tau) + \sum_s k_s I_s(r\tau)\cos(s\upsilon + \kappa_s)\}\, dr. \tag{91}$$

If (91) is an identity in the symbols $\tau, \upsilon$, it follows that the values
of s are the same for the two sides; and that

$$\kappa_s = \eta_s;$$

and that $h(r)k_0$, $h(r)k_s$ must satisfy

$$\psi(\tau) = \int_0^{\infty} h(r)\, k_0\, I_0(r\tau)\, dr,$$

$$\psi_s(\tau) = \int_0^{\infty} h(r)\, k_s\, I_s(r\tau)\, dr.$$


Hence, by the theorem already quoted,

$$h(r)k_0 = \int_0^{\infty} rx\, J_0(rx)\, \psi(ix)\, dx,$$

$$h(r)k_s = \int_0^{\infty} rx\, J_s(rx)\, \psi_s(ix)\, dx$$

and the frequency of resultant point, or vector, is

$$h(r)\,\{k_0 + \sum_s k_s\cos(s\theta + \eta_s)\}\, dr\, d\theta/2\pi,$$

or

$$\int_0^{\infty} rx\,\{J_0(rx)\,\psi(ix) + \sum_s J_s(rx)\,\psi_s(ix)\cos(s\theta + \eta_s)\}\, dx\; dr\, d\theta/2\pi. \tag{92}$$

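14. Polar symbols in three dimensional distributions. If $x = r\cos\theta$,
$y = r\sin\theta\cos\phi$, $z = r\sin\theta\sin\phi$, then the symbols $\xi, \eta, \zeta$ may be replaced
by $\tau, \upsilon, \omega$ defined by

$$\xi = \tau\cos\upsilon, \qquad \eta = \tau\sin\upsilon\cos\omega, \qquad \zeta = \tau\sin\upsilon\sin\omega$$

and the point x, y, z therefore has symbols

$$X^{x}Y^{y}Z^{z}, \quad\text{or}\quad e^{x\xi + y\eta + z\zeta}, \quad\text{or}\quad e^{r\tau\cos\widehat{r\tau}},$$

where $\widehat{r\tau}$ is the expression for the angle between 'directions' r and $\tau$
and

$$\cos\widehat{r\tau} = \cos\theta\cos\upsilon + \sin\theta\cos\phi\,\sin\upsilon\cos\omega + \sin\theta\sin\phi\,\sin\upsilon\sin\omega$$

$$= \tfrac12\,[\cos(\upsilon - \theta)\{1 + \cos(\omega - \phi)\} + \cos(\upsilon + \theta)\{1 - \cos(\omega - \phi)\}]. \tag{93}$$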

A distribution of points, whose frequency at $r, \theta, \phi$ is

$$F(r, \theta, \phi)\, dr\, d\theta\, d\phi,$$

will be arrayed as

$$\iiint F(r, \theta, \phi)\, e^{r\tau\cos\widehat{r\tau}}\, dr\, d\theta\, d\phi, \tag{94}$$

taken between 0 and $\infty$, or definite limits, for r, between 0 and $\pi$ for $\theta$
and between 0 and $2\pi$ for $\phi$.

15. Spherical distributions and vector sampling when all directions
in space are equally probable. Hence points uniformly distributed over
a spherical shell of radius r will have as their expression

$$\int_0^{\pi} \tfrac12\sin\theta\; e^{r\tau\cos\theta}\, d\theta, \quad\text{or}\quad \frac{\sinh r\tau}{r\tau}. \tag{95}$$

Thus  the  expression  for  a  spherical  shell  is  the  same  in  form  as  the 
expression,  by  (69),  for  the  polar  diameter,  but  the  interpretation  of  the 
symbol  is  different  in  the  two  cases. 
A distribution of points in space, the same in all directions, will, in
consequence, be shown as

$$\int_0^{\infty} f(r)\,\frac{\sinh r\tau}{r\tau}\, dr, \tag{96}$$

where $f(r)\, dr$ is the frequency of points at distance r.

Denoting this by $\phi(\tau)$, then $\phi(\tau)$ may represent a universal of points,
steps, or vectors, from which samples are drawn at random.

If n samples are drawn, the resultant vector, compounded of the
samplings, by (16), will have as its array $\psi(\tau) = \{\phi(\tau)\}^{n}$.

As in the two dimensional case, $\phi(\tau)$ is a moment array, and, by (96),

$$\phi(\tau) = 1 + m_2\frac{\tau^2}{3!} + m_4\frac{\tau^4}{5!} + \ldots \tag{97}$$

But $\psi(\tau)$ is also a distribution of points in spherical shells, and so
the moments of the resultant tensor are given by the expansion

$$\psi(\tau) = 1 + M_2\frac{\tau^2}{3!} + M_4\frac{\tau^4}{5!} + \ldots \tag{98}$$

From the simple relation of $\psi(\tau)$ to $\phi(\tau)$ we find the solution in
moments of this problem

$$M_4 = nm_4 + \tfrac53\, n(n-1)m_2^{2},$$

$$M_6 = nm_6 + 7n(n-1)m_2 m_4 + \tfrac{35}{9}\, n(n-1)(n-2)m_2^{3},$$

$$M_8 = nm_8 + \tfrac35\, n(n-1)(21m_4^{2} + 20m_2 m_6) + 42n(n-1)(n-2)m_2^{2}m_4 + \tfrac{35}{3}\, n(n-1)(n-2)(n-3)m_2^{4}. \tag{99}$$
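As in the plane case, the relations (99) result from raising the series (97) to the power n. A minimal sketch in exact arithmetic follows; the function name and the dictionary layout of the moments are assumptions made for illustration.

```python
from fractions import Fraction
from math import factorial


def resultant_moments_space(m, n):
    """Even moments M_2, M_4, ... of the resultant of n random vectors in space.

    m maps 2s to the 2s-th moment of vector length, e.g. {2: m2, 4: m4}.
    """
    order = max(m) // 2
    # Series (97) in powers of tau**2: coefficient m_{2s} / (2s+1)!.
    phi = [Fraction(1)] + [Fraction(m[2 * s]) / factorial(2 * s + 1)
                           for s in range(1, order + 1)]
    psi = [Fraction(1)] + [Fraction(0)] * order
    for _ in range(n):  # psi = phi**n, truncated at tau**(2*order)
        nxt = [Fraction(0)] * (order + 1)
        for i in range(order + 1):
            for j in range(order + 1 - i):
                nxt[i + j] += psi[i] * phi[j]
        psi = nxt
    # Series (98): M_{2s} is (2s+1)! times the coefficient of tau**(2s).
    return {2 * s: psi[s] * factorial(2 * s + 1) for s in range(1, order + 1)}


if __name__ == "__main__":
    # Two unit vectors in random directions in space.
    print(resultant_moments_space({2: 1, 4: 1, 6: 1, 8: 1}, 2))
```

For two unit vectors the squared resultant is 2 + 2c with c uniform between -1 and 1, which gives the moments 2, 16/3, 16, 256/5 returned by the routine.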

When, by the methods of Ch. II, the distribution of the resultants
of samples drawn in any manner from given universals such as $\phi(\tau)$ has
been expressed as an array, $\psi(\tau)$, in the tensor symbol, we pass, by (96),
to the solution in frequencies, say $h(r)\, dr$, of tensor length, by solving
the integral equation

$$\psi(\tau) = \int_0^{\infty} h(r)\,\frac{\sinh r\tau}{r\tau}\, dr.$$

The solution, by Fourier's inversion formula* is

$$h(r) = \frac{2}{\pi}\int_0^{\infty} rx\,\sin(rx)\,\psi(ix)\, dx, \tag{100}$$


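* Fourier's inversion formula

$$\psi(t) = \int_0^{\infty} f(r)\sin rt\, dr, \qquad f(r) = \frac{2}{\pi}\int_0^{\infty} \psi(t)\sin rt\, dt$$

is the particular case of Hankel's inversion formula

$$\psi(t) = \int_0^{\infty} f(r)\, r J_n(rt)\, dr, \qquad f(r) = \int_0^{\infty} \psi(t)\, t J_n(rt)\, dt,$$

when $n = \tfrac12$, in virtue of the relation $J_{\frac12}(x)/\sqrt{x} = \sqrt{2/\pi}\,\sin x / x$.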
in which the imaginary is, of course, apparent only, the arrays $\phi(\tau)$,
$\psi(\tau)$ being even functions of $\tau$.

In particular, if n vectors of length l are randomly directed in space,
we are, in effect, drawing n sample points from the spherical surface
$\sinh l\tau / l\tau$ and the array of resultants will, therefore, be

$$\psi(\tau) = (\sinh l\tau / l\tau)^{n}.$$

The frequency of tensor length r will, therefore, by (100), be
$h(r)\, dr$, where

$$h(r) = \frac{2}{\pi}\int_0^{\infty} rx\,\sin rx\,(\sin lx / lx)^{n}\, dx,$$

in agreement with Rayleigh's formula*.

16. Fortuities of resultant tensor when n randomly selected tensors
are set in random directions in a space of s dimensions. The generalisation
of the foregoing formulae, given for one†, two, and three dimensions,
to the case of s dimensions, directions being random, is not difficult and
may be indicated briefly as follows.

Clearly the denominator, or symbol, for the point is as before

$$X^{x}Y^{y}Z^{z} \ldots, \quad\text{or}\quad e^{x\xi + y\eta + z\zeta + \ldots}, \quad\text{or}\quad e^{r\tau\cos\widehat{r\tau}},$$

where r is the radius vector, $\tau$ the tensor symbol, defined much as in
the previous three-dimensional case, and $\widehat{r\tau}$, as there, is the algebraical
expression for the angle between the actual direction r and the conceptual
direction $\tau$ in the s-dimensional space.

Calling this angle $\theta$, then in order to form the frequency array of a
uniform spherical 'surface' of radius r in this space, the frequency, or
numerator, of the point denominated $e^{r\tau\cos\theta}$ is the 'area' of the zone
$d\theta$ divided by the 'area' of the sphere. Now if $\sigma_s(r)$ designate the

* Loc. cit. p. 341, formula (59), in which $dP/dr$ is, in his notation, the frequency
coefficient $h(r)$.

† The parallel theorem in one dimension leads to a solution of the 'problem of
moments.' $2\cosh x\xi$ or $X^{x} + X^{-x}$ is a doublet of points and, if $\psi(\xi)$ is an even
moment array, the solution of

$$\psi(\xi) = \int_0^{\infty} h(x)\, 2\cosh x\xi\, dx \quad\text{is}\quad h(x) = \frac{1}{\pi}\int_0^{\infty} \cos ux \cdot \psi(iu)\, du.$$

$2\sinh x\xi$ or $X^{x} - X^{-x}$ is a doublet of zero frequency and, if $\chi(\xi)$ is an odd moment
array, the solution of

$$\chi(\xi) = \int_0^{\infty} k(x)\, 2\sinh x\xi\, dx \quad\text{is}\quad k(x) = \frac{1}{\pi}\int_0^{\infty} \sin ux \cdot \chi(iu)/i\; du.$$

If, then, $\phi(\xi) = \psi(\xi) + \chi(\xi)$ is any moment array as (23), divided into even and odd
arrays, the solution in frequencies is

$$f(x) = h(x) + k(x) = \frac{1}{\pi}\int_0^{\infty} \{\cos ux \cdot \psi(iu) + \sin ux \cdot \chi(iu)/i\}\, du. \tag{105}$$
TT  /  0 
'surface' of a sphere of radius r in s-dimensional space, the 'area' of
zone $d\theta$ is $r\,d\theta \cdot \sigma_{s-1}(r\sin\theta)$ and the frequency of the point is therefore
$r\,d\theta \cdot \sigma_{s-1}(r\sin\theta)/\sigma_s(r)$.

Clearly this is $\sin^{s-2}\theta\, d\theta \big/ B\!\left(\tfrac12, \tfrac{s-1}{2}\right)$, since the power of $\sin\theta$
follows from dimensions and the divisor follows from the consideration
that the integral from 0 to $\pi$ is unity.

Thus a spherical 'surface' of radius r is

$$\int_0^{\pi} e^{r\tau\cos\theta}\,\sin^{s-2}\theta\, d\theta \Big/ B\!\left(\tfrac12, \tfrac{s-1}{2}\right) \tag{101}$$

$$= \Gamma\!\left(\tfrac{s}{2}\right)\frac{I_{\frac{s}{2}-1}(r\tau)}{(\tfrac12 r\tau)^{\frac{s}{2}-1}}. \tag{102}$$

This expression reduces to $I_0(r\tau)$ when s = 2 and to $\sinh r\tau / r\tau$ when
s = 3 (see foregoing note) thus giving the arrays of circle and spherical
surface already found in the two and three dimensional cases.
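The reduction of the generalised 'surface' array to the circle and sphere can be confirmed from its power series. The sketch below sums that series for a real argument, using the standard series of the Bessel function with imaginary argument; the function name and the number of terms kept are illustrative assumptions.

```python
import math


def shell_array(z, s, terms=40):
    """Array of a uniform spherical 'surface' in s dimensions, with z = r * tau.

    Sums Gamma(s/2) * (z/2)**(2k) / (k! * Gamma(k + s/2)), which is the power
    series of Gamma(s/2) * I_{s/2-1}(z) / (z/2)**(s/2 - 1).
    """
    total = 0.0
    for k in range(terms):
        total += (math.gamma(s / 2) * (z / 2) ** (2 * k)
                  / (math.factorial(k) * math.gamma(k + s / 2)))
    return total


if __name__ == "__main__":
    z = 1.7
    print(shell_array(z, 1), math.cosh(z))      # doublet of points on a line
    print(shell_array(z, 3), math.sinh(z) / z)  # spherical surface
```

The case s = 1 gives cosh z, the doublet of the one-dimensional note, and s = 3 gives sinh z / z.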

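Proceeding as before, a distribution of points in uniform generalised
spherical shells is shown as

$$\int_0^{\infty} f(r)\,\Gamma\!\left(\tfrac{s}{2}\right)\frac{I_{\frac{s}{2}-1}(r\tau)}{(\tfrac12 r\tau)^{\frac{s}{2}-1}}\, dr,$$

$f(r)\, dr$ being the frequency of points in shell dr at distance r.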

This being a frequency array, any problem requiring the distribution
of resultants of samples obtained in any manner from such universals is
at once, by the methods of Ch. II, expressed as a kindred frequency
array. For instance, if $\phi(\tau)$ stands for the above universal and samples
of n are taken at random as to length and direction, the resultants have
array

$$\psi(\tau) = \{\phi(\tau)\}^{n}.$$

Secondly, such arrays as $\phi(\tau)$, $\psi(\tau)$ are moment arrays and when
expanded in powers of the symbol, $\tau$, give the moments of the tensor
length as certain coefficients. In the present case, by (102),

$$\phi(\tau) = 1 + m_2\frac{\tau^2}{s \cdot 2} + m_4\frac{\tau^4}{s(s+2) \cdot 2 \cdot 4} + \ldots \tag{103}$$

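And, thirdly, from the solution of the problem as a frequency array,
$\psi(\tau)$, we may obtain the frequencies, $h(r)\, dr$, of the tensor length by
solving the integral equation

$$\psi(\tau) = \int_0^{\infty} h(r)\,\Gamma\!\left(\tfrac{s}{2}\right)\frac{I_{\frac{s}{2}-1}(r\tau)}{(\tfrac12 r\tau)^{\frac{s}{2}-1}}\, dr \tag{104}$$

and the solution is, by a further application of Hankel's formula,

$$h(r) = \int_0^{\infty} \frac{(rx)^{\frac{s}{2}}\, J_{\frac{s}{2}-1}(rx)}{2^{\frac{s}{2}-1}\,\Gamma\!\left(\tfrac{s}{2}\right)}\,\psi(ix)\, dx.$$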
In the foregoing pages illustrations are given of the assistance that
objective symbols can render in the specification of numerical aggregates,
whether seen in a single or manifold classification. Such symbols
are of the simplest character, answering as they do to the common noun
in speech, and it is not surprising to find that their use brings to the
work of analysis a brevity that the algebraical symbols, with their more
involved connotation, fail to give. That the equation $f(x, y) = 0$ aptly
describes a relation between the number x of the literal X and the
number y of the literal Y is beyond doubt, but if the purpose be merely
to state that there are 15 of X and 20 of Y then either the expression
$15X + 20Y$ or the symbol $X^{15}Y^{20}$ appears to fulfil such requirements in
a sufficient manner and more simply than the equations $x = 15$, $y = 20$.

The  aim,  then,  has  been  to  recommend  the  use  of  logical  symbols  in 
the  enumeration  of  logical  classes.  It  has  been  no  material  part  of  the 
purpose  to  establish  new  formulae  and  results  in  the  mathematical  theory 
of  statistics  and  if  new  conclusions  have  been  reached  these  will  serve 
chiefly  to  help  point  the  precept,  since  no  difficult  analysis  has  been 
undertaken  or  is  anywhere  involved. 

It  is  possible  that,  with  wider  familiarity  in  their  use,  the  applications 
of  the  symbols  of  denomination  may  bear  extension  and  that  they  may 
be  found  of  assistance  in  the  development  of  the  higher  theory  both  of 
statistical  and  other  distributions. 